In your final group assignment you have to analyse data about Airbnb listings and fit a model to predict the total cost for two people staying 4 nights in an AirBnB in a city. You can download AirBnB data from insideairbnb.com; it was originally scraped from airbnb.com.

The following Google sheet shows which cities you can use; please choose one of them and add your group name next to it, e.g., A7, B13. No city can have more than 2 groups per stream working on it; if this happens, I will allocate study groups to cities with the help of R’s sampling.

All of the listings are provided as a GZ file, i.e., an archive compressed with the standard GNU zip (gzip) algorithm. You can download, save and extract the file if you want, but vroom::vroom() or readr::read_csv() can read and decompress this kind of file directly. You should prefer vroom() as it is faster, but if vroom() is blocked by a firewall, please use read_csv() instead.

vroom will download the *.gz zipped file, unzip, and provide you with the dataframe.
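As a minimal sketch of reading a gzip-compressed CSV without unzipping it first, the example below writes a tiny stand-in file to a temporary path and reads it back with readr::read_csv(). The file, its two columns and the object names are made up for illustration; the real download location is not assumed here.

```r
library(readr)

# Hypothetical stand-in for a downloaded listings file: a tiny gzip-compressed CSV
# written to a temporary path (the real file name and URL are not assumed here)
tmp_path <- tempfile(fileext = ".csv.gz")
toy <- data.frame(id = c(1, 2, 3),
                  price = c("$480.00", "$464.00", "$1,200.00"))
write_csv(toy, tmp_path)  # readr compresses the output because of the .gz extension

# read_csv() decompresses .gz files transparently; no manual unzip step is needed
listings_toy <- read_csv(
  tmp_path,
  col_types = cols(id = col_double(), price = col_character())
)

dim(listings_toy)
```

vroom::vroom(tmp_path) should behave the same way on this file; read_csv() is used here only to keep the sketch to a single package.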

Even though there are many variables in the dataframe, here is a quick description of some of the variables collected, and you can find a data dictionary here

  • price = cost per night

  • property_type: type of accommodation (House, Apartment, etc.)

  • room_type:

    • Entire home/apt (guests have entire place to themselves)
    • Private room (Guests have private room to sleep, all other rooms shared)
    • Shared room (Guests sleep in room shared with others)
  • number_of_reviews: Total number of reviews for the listing

  • review_scores_rating: Average review score (described as 0 - 100; in the data used below the values range from 0 to 5)

  • longitude , latitude: geographical coordinates to help us locate the listing

  • neighbourhood*: three variables on a few major neighbourhoods in each city

1 Exploratory Data Analysis (EDA)

In the R4DS Exploratory Data Analysis chapter, the authors state:

“Your goal during EDA is to develop an understanding of your data. The easiest way to do this is to use questions as tools to guide your investigation… EDA is fundamentally a creative process. And like most creative processes, the key to asking quality questions is to generate a large quantity of questions.”

Conduct a thorough EDA. Recall that an EDA involves three things:

  • Looking at the raw values.
    • dplyr::glimpse()
  • Computing summary statistics of the variables of interest, or finding NAs
    • mosaic::favstats()
    • skimr::skim()
  • Creating informative visualizations.
    • ggplot2::ggplot()
      • geom_histogram() or geom_density() for numeric continuous variables
      • geom_bar() or geom_col() for categorical variables
    • GGally::ggpairs() for scatterplot/correlation matrix
      • Note that you can add transparency to points/density plots in the aes call, for example: aes(colour = gender, alpha = 0.4)
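A small sketch of the transparency note above, on simulated data (the columns gender and height are made up for illustration). Putting alpha inside aes() works, but ggplot2 then treats the constant as a mapped variable and adds a legend entry for it; setting alpha as a fixed argument of the geom gives the same transparency without the extra legend.

```r
library(ggplot2)

# Toy data; `gender` and `height` are made-up columns for illustration only
set.seed(1)
toy <- data.frame(
  gender = rep(c("F", "M"), each = 50),
  height = c(rnorm(50, 165, 7), rnorm(50, 178, 7))
)

# alpha mapped inside aes(): treated as a variable, so ggplot2 adds an alpha legend
p_mapped <- ggplot(toy, aes(x = height, fill = gender, alpha = 0.4)) +
  geom_density()

# alpha set as a constant outside aes(): same transparency, no extra legend
p_fixed <- ggplot(toy, aes(x = height, fill = gender)) +
  geom_density(alpha = 0.4)
```

Printing p_mapped and p_fixed side by side shows the difference in the legends.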

You may wish to have a level 1 header (#) for your EDA, then use level 2 sub-headers (##) to make sure you cover all three EDA bases. At a minimum you should address these questions:

  • How many variables/columns? How many rows/observations?
  • Which variables are numbers?
  • Which are categorical or factor variables (numeric or character variables that have a fixed and known set of possible values)?
  • What are the correlations between variables? Does each scatterplot support a linear relationship between variables? Do any of the correlations appear to be conditional on the value of a categorical variable?
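One possible way to start answering these questions is sketched below on a five-row toy tibble (the values are invented; only the column names follow the listings data). It counts rows and columns, counts distinct values per column as a hint for categorical variables, and computes pairwise correlations of the numeric columns.

```r
library(dplyr)

# Invented values; only the column names mirror the listings data
toy <- tibble(
  price = c(480, 464, 300, 150, 220),
  accommodates = c(3, 2, 2, 1, 2),
  number_of_reviews = c(85, 42, 27, 0, 10),
  room_type = c("Entire home/apt", "Entire home/apt", "Private room",
                "Shared room", "Private room")
)

dim(toy)  # number of rows, number of columns

# Few distinct values relative to the number of rows hints at a categorical variable
toy %>% summarise(across(everything(), n_distinct))

# Keep only numeric columns, then compute pairwise correlations (ignoring NAs pairwise)
num_cols <- toy %>% select(where(is.numeric))
cor_matrix <- cor(num_cols, use = "pairwise.complete.obs")
round(cor_matrix, 2)
```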

At this stage, you may also find you want to use filter, mutate, arrange, select, or count. Let your questions lead you!

In all cases, please think about the message your plot is conveying. Don’t just say “This is my X-axis, this is my Y-axis”, but rather what’s the so what of the plot. Tell some sort of story and speculate about the differences in the patterns in no more than a paragraph.

2 Exploratory Data Analysis (EDA)

2.1 Exploring Raw Values

#glimpse function allowed us to see all the variables in the dataset and their types. We noticed that some numeric variables were categorised as character variables, e.g., price
glimpse(listings) 
Rows: 27,805
Columns: 74
$ id                                           <dbl> 24963, 322045, 402315, 47…
$ listing_url                                  <chr> "https://www.airbnb.com/r…
$ scrape_id                                    <dbl> 2.021093e+13, 2.021093e+1…
$ last_scraped                                 <date> 2021-09-29, 2021-09-28, …
$ name                                         <chr> "Heart of French Built Mu…
$ description                                  <chr> "The flat is located in t…
$ neighborhood_overview                        <chr> "It's Shanghai Music Conc…
$ picture_url                                  <chr> "https://a0.muscache.com/…
$ host_id                                      <dbl> 98203, 681552, 681552, 68…
$ host_url                                     <chr> "https://www.airbnb.com/u…
$ host_name                                    <chr> "Jia", "Leon", "Leon", "L…
$ host_since                                   <date> 2010-03-24, 2011-06-09, …
$ host_location                                <chr> "Shanghai, Shanghai, Chin…
$ host_about                                   <chr> "I am an architect, train…
$ host_response_time                           <chr> "N/A", "within an hour", …
$ host_response_rate                           <chr> "N/A", "100%", "100%", "1…
$ host_acceptance_rate                         <chr> "N/A", "100%", "100%", "1…
$ host_is_superhost                            <lgl> TRUE, TRUE, TRUE, TRUE, T…
$ host_thumbnail_url                           <chr> "https://a0.muscache.com/…
$ host_picture_url                             <chr> "https://a0.muscache.com/…
$ host_neighbourhood                           <chr> "Conservatory", "Changsho…
$ host_listings_count                          <dbl> 2, 16, 16, 16, 16, 16, 16…
$ host_total_listings_count                    <dbl> 2, 16, 16, 16, 16, 16, 16…
$ host_verifications                           <chr> "['email', 'phone', 'revi…
$ host_has_profile_pic                         <lgl> TRUE, TRUE, TRUE, TRUE, T…
$ host_identity_verified                       <lgl> TRUE, TRUE, TRUE, TRUE, T…
$ neighbourhood                                <chr> "Shanghai, China", "Shang…
$ neighbourhood_cleansed                       <chr> "徐汇区 / Xuhui District"…
$ neighbourhood_group_cleansed                 <lgl> NA, NA, NA, NA, NA, NA, N…
$ latitude                                     <dbl> 31.21073, 31.24240, 31.24…
$ longitude                                    <dbl> 121.4516, 121.4445, 121.4…
$ property_type                                <chr> "Entire rental unit", "En…
$ room_type                                    <chr> "Entire home/apt", "Entir…
$ accommodates                                 <dbl> 3, 2, 2, 2, 2, 2, 3, 2, 1…
$ bathrooms                                    <lgl> NA, NA, NA, NA, NA, NA, N…
$ bathrooms_text                               <chr> "1 bath", "1 bath", "1 ba…
$ bedrooms                                     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ beds                                         <dbl> 2, 1, 1, 1, 1, 1, 2, 1, 1…
$ amenities                                    <chr> "[\"Smoke alarm\", \"Sham…
$ price                                        <chr> "$480.00", "$464.00", "$4…
$ minimum_nights                               <dbl> 3, 1, 1, 1, 1, 1, 1, 1, 3…
$ maximum_nights                               <dbl> 365, 1125, 1125, 1125, 11…
$ minimum_minimum_nights                       <dbl> 3, 1, 1, 1, 1, 1, 1, 1, 3…
$ maximum_minimum_nights                       <dbl> 3, 1, 1, 1, 1, 1, 1, 1, 3…
$ minimum_maximum_nights                       <dbl> 365, 1125, 1125, 1125, 11…
$ maximum_maximum_nights                       <dbl> 365, 1125, 1125, 1125, 11…
$ minimum_nights_avg_ntm                       <dbl> 3, 1, 1, 1, 1, 1, 1, 1, 3…
$ maximum_nights_avg_ntm                       <dbl> 365, 1125, 1125, 1125, 11…
$ calendar_updated                             <lgl> NA, NA, NA, NA, NA, NA, N…
$ has_availability                             <lgl> TRUE, TRUE, TRUE, TRUE, T…
$ availability_30                              <dbl> 0, 0, 0, 25, 0, 21, 22, 2…
$ availability_60                              <dbl> 0, 0, 28, 55, 0, 51, 52, …
$ availability_90                              <dbl> 0, 0, 58, 85, 0, 81, 82, …
$ availability_365                             <dbl> 240, 242, 333, 360, 41, 3…
$ calendar_last_scraped                        <date> 2021-09-29, 2021-09-28, …
$ number_of_reviews                            <dbl> 85, 42, 27, 28, 34, 77, 3…
$ number_of_reviews_ltm                        <dbl> 0, 0, 7, 0, 0, 10, 14, 10…
$ number_of_reviews_l30d                       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0…
$ first_review                                 <date> 2012-10-15, 2014-12-14, …
$ last_review                                  <date> 2019-11-22, 2017-11-13, …
$ review_scores_rating                         <dbl> 4.74, 4.78, 4.69, 4.36, 4…
$ review_scores_accuracy                       <dbl> 4.87, 4.64, 4.76, 4.23, 4…
$ review_scores_cleanliness                    <dbl> 4.54, 4.52, 4.72, 4.12, 4…
$ review_scores_checkin                        <dbl> 4.77, 4.80, 4.72, 4.50, 4…
$ review_scores_communication                  <dbl> 4.70, 4.86, 4.96, 4.65, 4…
$ review_scores_location                       <dbl> 4.86, 4.59, 4.80, 4.58, 4…
$ review_scores_value                          <dbl> 4.76, 4.56, 4.76, 4.42, 4…
$ license                                      <lgl> NA, NA, NA, NA, NA, NA, N…
$ instant_bookable                             <lgl> FALSE, TRUE, TRUE, TRUE, …
$ calculated_host_listings_count               <dbl> 1, 16, 16, 16, 16, 16, 16…
$ calculated_host_listings_count_entire_homes  <dbl> 1, 16, 16, 16, 16, 16, 16…
$ calculated_host_listings_count_private_rooms <dbl> 0, 0, 0, 0, 0, 0, 0, 2, 1…
$ calculated_host_listings_count_shared_rooms  <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0…
$ reviews_per_month                            <dbl> 0.78, 0.51, 0.24, 0.25, 0…
#This function gave us an insight into the missing values and summary statistics for each variable 
skim(listings) 
Data summary
Name listings
Number of rows 27805
Number of columns 74
_______________________
Column type frequency:
character 23
Date 5
logical 9
numeric 37
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
listing_url 0 1.00 34 37 0 27805 0
name 0 1.00 1 113 0 26678 0
description 1187 0.96 1 1000 0 22493 0
neighborhood_overview 4778 0.83 1 1000 0 15123 0
picture_url 0 1.00 61 126 0 26684 0
host_url 0 1.00 39 43 0 7561 0
host_name 17 1.00 1 42 0 5557 0
host_location 33 1.00 2 51 0 171 0
host_about 11312 0.59 1 3298 0 3849 22
host_response_time 12 1.00 3 18 0 5 0
host_response_rate 12 1.00 2 4 0 53 0
host_acceptance_rate 12 1.00 2 4 0 82 0
host_thumbnail_url 12 1.00 55 106 0 7530 0
host_picture_url 12 1.00 57 109 0 7530 0
host_neighbourhood 13506 0.51 2 31 0 101 0
host_verifications 0 1.00 2 179 0 323 0
neighbourhood 4778 0.83 15 35 0 25 0
neighbourhood_cleansed 0 1.00 13 24 0 16 0
property_type 0 1.00 4 35 0 89 0
room_type 0 1.00 10 15 0 4 0
bathrooms_text 55 1.00 6 17 0 72 0
amenities 0 1.00 16 1304 0 21170 0
price 0 1.00 5 10 0 3289 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
last_scraped 0 1.00 2021-09-28 2021-10-05 2021-09-29 4
host_since 12 1.00 2010-03-24 2021-09-27 2018-12-18 2474
calendar_last_scraped 0 1.00 2021-09-28 2021-10-05 2021-09-29 4
first_review 10374 0.63 2012-07-03 2021-09-29 2020-01-13 1933
last_review 10374 0.63 2012-07-07 2021-09-30 2021-03-26 1519

Variable type: logical

skim_variable n_missing complete_rate mean count
host_is_superhost 12 1 0.36 FAL: 17690, TRU: 10103
host_has_profile_pic 12 1 1.00 TRU: 27756, FAL: 37
host_identity_verified 12 1 1.00 TRU: 27781, FAL: 12
neighbourhood_group_cleansed 27805 0 NaN :
bathrooms 27805 0 NaN :
calendar_updated 27805 0 NaN :
has_availability 0 1 1.00 TRU: 27802, FAL: 3
license 27805 0 NaN :
instant_bookable 0 1 0.66 TRU: 18326, FAL: 9479

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
id 0 1.00 4.030854e+07 9889734.37 2.496300e+04 3.510384e+07 4.314768e+07 4.825886e+07 5.250628e+07 ▁▁▂▅▇
scrape_id 0 1.00 2.021093e+13 0.00 2.021093e+13 2.021093e+13 2.021093e+13 2.021093e+13 2.021093e+13 ▁▁▇▁▁
host_id 0 1.00 2.236987e+08 115231469.53 9.820300e+04 1.308527e+08 2.312615e+08 3.189058e+08 4.247836e+08 ▆▇▇▇▇
host_listings_count 12 1.00 2.395000e+01 76.00 0.000000e+00 1.000000e+00 6.000000e+00 1.600000e+01 1.100000e+03 ▇▁▁▁▁
host_total_listings_count 12 1.00 2.395000e+01 76.00 0.000000e+00 1.000000e+00 6.000000e+00 1.600000e+01 1.100000e+03 ▇▁▁▁▁
latitude 0 1.00 3.120000e+01 0.14 3.071000e+01 3.114000e+01 3.120000e+01 3.123000e+01 3.183000e+01 ▁▅▇▁▁
longitude 0 1.00 1.215100e+02 0.17 1.208600e+02 1.214400e+02 1.214900e+02 1.216600e+02 1.219400e+02 ▁▁▇▆▁
accommodates 0 1.00 3.780000e+00 3.43 0.000000e+00 2.000000e+00 2.000000e+00 4.000000e+00 1.600000e+01 ▇▃▁▁▁
bedrooms 975 0.96 1.750000e+00 1.85 1.000000e+00 1.000000e+00 1.000000e+00 2.000000e+00 5.000000e+01 ▇▁▁▁▁
beds 212 0.99 2.230000e+00 2.71 0.000000e+00 1.000000e+00 1.000000e+00 2.000000e+00 5.000000e+01 ▇▁▁▁▁
minimum_nights 0 1.00 6.740000e+00 32.99 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+03 ▇▁▁▁▁
maximum_nights 0 1.00 8.670600e+02 436.07 1.000000e+00 3.650000e+02 1.125000e+03 1.125000e+03 1.999900e+04 ▇▁▁▁▁
minimum_minimum_nights 0 1.00 6.630000e+00 32.69 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+03 ▇▁▁▁▁
maximum_minimum_nights 0 1.00 6.860000e+00 33.40 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+03 ▇▁▁▁▁
minimum_maximum_nights 0 1.00 9.347700e+02 392.87 1.000000e+00 1.125000e+03 1.125000e+03 1.125000e+03 1.999900e+04 ▇▁▁▁▁
maximum_maximum_nights 0 1.00 9.364400e+02 391.40 1.000000e+00 1.125000e+03 1.125000e+03 1.125000e+03 1.999900e+04 ▇▁▁▁▁
minimum_nights_avg_ntm 0 1.00 6.740000e+00 32.90 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+00 1.000000e+03 ▇▁▁▁▁
maximum_nights_avg_ntm 0 1.00 9.361200e+02 391.44 1.000000e+00 1.125000e+03 1.125000e+03 1.125000e+03 1.999900e+04 ▇▁▁▁▁
availability_30 0 1.00 1.912000e+01 10.80 0.000000e+00 6.000000e+00 2.400000e+01 2.800000e+01 3.000000e+01 ▅▁▁▅▇
availability_60 0 1.00 4.472000e+01 18.09 0.000000e+00 3.500000e+01 5.300000e+01 5.800000e+01 6.000000e+01 ▁▁▂▁▇
availability_90 0 1.00 7.099000e+01 25.64 0.000000e+00 6.500000e+01 8.300000e+01 8.700000e+01 9.000000e+01 ▁▁▁▂▇
availability_365 0 1.00 2.493600e+02 126.53 0.000000e+00 9.200000e+01 3.360000e+02 3.600000e+02 3.650000e+02 ▂▂▂▁▇
number_of_reviews 0 1.00 1.234000e+01 29.52 0.000000e+00 0.000000e+00 2.000000e+00 1.000000e+01 4.580000e+02 ▇▁▁▁▁
number_of_reviews_ltm 0 1.00 3.940000e+00 8.80 0.000000e+00 0.000000e+00 0.000000e+00 4.000000e+00 1.510000e+02 ▇▁▁▁▁
number_of_reviews_l30d 0 1.00 1.900000e-01 0.70 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.500000e+01 ▇▁▁▁▁
review_scores_rating 10374 0.63 4.630000e+00 0.92 0.000000e+00 4.710000e+00 4.940000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_accuracy 10852 0.61 4.830000e+00 0.45 1.000000e+00 4.860000e+00 5.000000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_cleanliness 10852 0.61 4.780000e+00 0.47 1.000000e+00 4.750000e+00 4.960000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_checkin 10854 0.61 4.860000e+00 0.42 1.000000e+00 4.900000e+00 5.000000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_communication 10852 0.61 4.880000e+00 0.40 1.000000e+00 4.930000e+00 5.000000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_location 10854 0.61 4.840000e+00 0.39 1.000000e+00 4.830000e+00 5.000000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
review_scores_value 10854 0.61 4.760000e+00 0.50 1.000000e+00 4.750000e+00 4.930000e+00 5.000000e+00 5.000000e+00 ▁▁▁▁▇
calculated_host_listings_count 0 1.00 1.644000e+01 28.05 1.000000e+00 3.000000e+00 9.000000e+00 1.700000e+01 2.220000e+02 ▇▁▁▁▁
calculated_host_listings_count_entire_homes 0 1.00 1.037000e+01 21.21 0.000000e+00 1.000000e+00 3.000000e+00 1.000000e+01 1.680000e+02 ▇▁▁▁▁
calculated_host_listings_count_private_rooms 0 1.00 5.880000e+00 14.04 0.000000e+00 0.000000e+00 1.000000e+00 7.000000e+00 1.280000e+02 ▇▁▁▁▁
calculated_host_listings_count_shared_rooms 0 1.00 2.000000e-01 1.92 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 4.000000e+01 ▇▁▁▁▁
reviews_per_month 10374 0.63 9.200000e-01 1.12 1.000000e-02 1.900000e-01 5.000000e-01 1.180000e+00 1.263000e+01 ▇▁▁▁▁
# parse_number() converts character data into numeric data. We apply it to price
listings <- listings %>% 
  mutate(price = parse_number(price))

# We check if the conversion was successful
typeof(listings$price)
[1] "double"
# We noticed that bathrooms is entirely NA and that the information is stored as text in bathrooms_text, so we use parse_number() to convert it to a numeric
listings <- listings %>% 
  mutate(bathrooms = parse_number(bathrooms_text))
favstats(listings$bathrooms)
 min  Q1  median   Q3  max  mean    sd      n  missing
   0   1       1  1.5   50  1.57  1.72  27429      376
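The summary above reports 376 missing values for bathrooms, while skim() reported only 55 missing values for bathrooms_text. One plausible explanation, sketched below, is that parse_number() returns NA for strings that contain no digits. The "half-bath" strings are an assumption about what such entries look like; they are not values shown in this report.

```r
library(readr)

# Example strings in the style of bathrooms_text; the two "half-bath" entries are
# assumed examples of text without a digit, not values taken from this report
bath_text <- c("1 bath", "1.5 shared baths", "Half-bath", "Shared half-bath")

# parse_number() returns NA (with a parsing warning) when a string contains no digits
bath_num <- suppressWarnings(parse_number(bath_text))
bath_num

# Count how many non-missing strings failed to parse
sum(is.na(bath_num) & !is.na(bath_text))
```

If this is what happens in the real data, such entries could be recoded by hand (for example to 0.5) before modelling, rather than being dropped as missing.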

2.2 Data wrangling

Once you load the data, it’s always a good idea to use glimpse to see what kind of variables you have and what data type (chr, num, logical, date, etc) they are.

Notice that some of the price data (price) is given as a character string, e.g., “$176.00”

Since price is a quantitative variable, we need to make sure it is stored as numeric data num in the dataframe. To do so, we will first use readr::parse_number() which drops any non-numeric characters before or after the first number

listings <- listings %>% 
  mutate(price = parse_number(price))

Use typeof(listings$price) to confirm that price is now stored as a number.
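To make the behaviour of parse_number() concrete, here is a short sketch on three made-up price strings. It shows that the leading dollar sign is dropped and that, under readr's default locale, the comma is treated as a thousands separator.

```r
library(readr)
library(dplyr)

# Made-up price strings in the same style as the price column
toy <- tibble(price = c("$176.00", "$480.00", "$1,125.00"))

# Drop the currency symbol and thousands separator, keep the number
toy <- toy %>%
  mutate(price = parse_number(price))

typeof(toy$price)
toy$price
```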

2.3 Property types

Next, we look at the variable property_type. We can use the count function to determine how many categories there are and their frequency. What are the top 4 most common property types? What proportion of the total listings do they make up?

property_type_by_proportion <- listings %>% 
  count(property_type) %>% 
  arrange(desc(n)) %>% 
  mutate(proportion = n/sum(n)*100)

property_type_by_proportion
property_type                        n     proportion
Entire rental unit                   6620  23.8
Private room in villa                3261  11.7
Entire residential home              2065  7.43
Entire villa                         1923  6.92
Private room in rental unit          1878  6.75
Entire condominium (condo)           1656  5.96
Entire loft                          1554  5.59
Private room in residential home     1501  5.4
Entire serviced apartment            1357  4.88
Private room in serviced apartment   843   3.03
Private room in condominium (condo)  595   2.14
Private room in kezhan               476   1.71
Room in boutique hotel               463   1.67
Private room in farm stay            423   1.52
Private room in bed and breakfast    333   1.2
Private room in townhouse            279   1
Room in hotel                        274   0.985
Shared room in rental unit           252   0.906
Private room in loft                 201   0.723
Farm stay                            189   0.68
Private room in cottage              189   0.68
Entire townhouse                     170   0.611
Shared room in hostel                149   0.536
Entire cottage                       128   0.46
Private room in resort               119   0.428
Private room in hostel               108   0.388
Room in aparthotel                   105   0.378
Private room in guesthouse           81    0.291
Shared room in condominium (condo)   77    0.277
Entire guest suite                   48    0.173
Shared room in residential home      43    0.155
Private room in guest suite          39    0.14
Entire bungalow                      29    0.104
Entire guesthouse                    29    0.104
Private room in bungalow             23    0.0827
Tiny house                           21    0.0755
Shared room in villa                 19    0.0683
Shared room in loft                  18    0.0647
Tent                                 17    0.0611
Private room in tiny house           15    0.0539
Shared room in guesthouse            15    0.0539
Entire cabin                         14    0.0504
Private room in cabin                14    0.0504
Shared room in townhouse             14    0.0504
Shared room in boutique hotel        12    0.0432
Private room                         11    0.0396
Private room in minsu                11    0.0396
Earth house                          9     0.0324
Entire chalet                        9     0.0324
Entire home/apt                      9     0.0324
Shared room in bed and breakfast     9     0.0324
Entire place                         8     0.0288
Private room in earth house          8     0.0288
Shared room in serviced apartment    8     0.0288
Entire bed and breakfast             5     0.018
Private room in barn                 5     0.018
Private room in castle               5     0.018
Private room in nature lodge         5     0.018
Shared room in guest suite           5     0.018
Shared room in tent                  5     0.018
Kezhan                               4     0.0144
Minsu                                4     0.0144
Private room in casa particular      4     0.0144
Camper/RV                            3     0.0108
Castle                               3     0.0108
Floor                                3     0.0108
Private room in chalet               3     0.0108
Private room in tent                 3     0.0108
Shared room in farm stay             3     0.0108
Casa particular                      2     0.00719
Religious building                   2     0.00719
Shared room in aparthotel            2     0.00719
Shared room in casa particular       2     0.00719
Barn                                 1     0.0036
Campsite                             1     0.0036
Entire hostel                        1     0.0036
Nature lodge                         1     0.0036
Pension                              1     0.0036
Private room in boat                 1     0.0036
Private room in camper/rv            1     0.0036
Private room in dome house           1     0.0036
Private room in houseboat            1     0.0036
Private room in ranch                1     0.0036
Riad                                 1     0.0036
Shared room                          1     0.0036
Shared room in barn                  1     0.0036
Shared room in bungalow              1     0.0036
Shared room in kezhan                1     0.0036
Treehouse                            1     0.0036

Since the vast majority of the observations in the data are one of the top four or five property types, we would like to create a simplified version of property_type variable that has 5 categories: the top four categories and Other. Fill in the code below to create prop_type_simplified.

listings <- listings %>%
  mutate(prop_type_simplified = case_when(
    property_type %in% c("Entire rental unit","Private room in villa", "Entire residential home","Entire villa") ~ property_type, 
    TRUE ~ "Other"
  ))
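As an alternative sketch to typing the top four names by hand, forcats::fct_lump_n() keeps the n most frequent levels and lumps the rest into one level. This is a swapped-in technique, not the approach used in this report, and the toy counts below are invented.

```r
library(dplyr)
library(forcats)

# Invented counts; only the category names come from the listings data
toy <- tibble(property_type = c(
  rep("Entire rental unit", 5),
  rep("Private room in villa", 4),
  rep("Entire residential home", 3),
  rep("Entire villa", 2),
  "Tent", "Castle"
))

# Keep the 4 most frequent property types and lump everything else into "Other"
toy <- toy %>%
  mutate(prop_type_simplified = as.character(
    fct_lump_n(property_type, n = 4, other_level = "Other")
  ))

toy %>% count(prop_type_simplified, sort = TRUE)
```

The advantage is that the code does not need to change if the ranking of property types differs in another city; the drawback is that ties at the cut-off need a decision (see the ties.method argument).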

listings %>%
  count(property_type, prop_type_simplified) %>%
  arrange(desc(n)) 
property_type                        prop_type_simplified     n
Entire rental unit                   Entire rental unit       6620
Private room in villa                Private room in villa    3261
Entire residential home              Entire residential home  2065
Entire villa                         Entire villa             1923
Private room in rental unit          Other                    1878
Entire condominium (condo)           Other                    1656
Entire loft                          Other                    1554
Private room in residential home     Other                    1501
Entire serviced apartment            Other                    1357
Private room in serviced apartment   Other                    843
Private room in condominium (condo)  Other                    595
Private room in kezhan               Other                    476
Room in boutique hotel               Other                    463
Private room in farm stay            Other                    423
Private room in bed and breakfast    Other                    333
Private room in townhouse            Other                    279
Room in hotel                        Other                    274
Shared room in rental unit           Other                    252
Private room in loft                 Other                    201
Farm stay                            Other                    189
Private room in cottage              Other                    189
Entire townhouse                     Other                    170
Shared room in hostel                Other                    149
Entire cottage                       Other                    128
Private room in resort               Other                    119
Private room in hostel               Other                    108
Room in aparthotel                   Other                    105
Private room in guesthouse           Other                    81
Shared room in condominium (condo)   Other                    77
Entire guest suite                   Other                    48
Shared room in residential home      Other                    43
Private room in guest suite          Other                    39
Entire bungalow                      Other                    29
Entire guesthouse                    Other                    29
Private room in bungalow             Other                    23
Tiny house                           Other                    21
Shared room in villa                 Other                    19
Shared room in loft                  Other                    18
Tent                                 Other                    17
Private room in tiny house           Other                    15
Shared room in guesthouse            Other                    15
Entire cabin                         Other                    14
Private room in cabin                Other                    14
Shared room in townhouse             Other                    14
Shared room in boutique hotel        Other                    12
Private room                         Other                    11
Private room in minsu                Other                    11
Earth house                          Other                    9
Entire chalet                        Other                    9
Entire home/apt                      Other                    9
Shared room in bed and breakfast     Other                    9
Entire place                         Other                    8
Private room in earth house          Other                    8
Shared room in serviced apartment    Other                    8
Entire bed and breakfast             Other                    5
Private room in barn                 Other                    5
Private room in castle               Other                    5
Private room in nature lodge         Other                    5
Shared room in guest suite           Other                    5
Shared room in tent                  Other                    5
Kezhan                               Other                    4
Minsu                                Other                    4
Private room in casa particular      Other                    4
Camper/RV                            Other                    3
Castle                               Other                    3
Floor                                Other                    3
Private room in chalet               Other                    3
Private room in tent                 Other                    3
Shared room in farm stay             Other                    3
Casa particular                      Other                    2
Religious building                   Other                    2
Shared room in aparthotel            Other                    2
Shared room in casa particular       Other                    2
Barn                                 Other                    1
Campsite                             Other                    1
Entire hostel                        Other                    1
Nature lodge                         Other                    1
Pension                              Other                    1
Private room in boat                 Other                    1
Private room in camper/rv            Other                    1
Private room in dome house           Other                    1
Private room in houseboat            Other                    1
Private room in ranch                Other                    1
Riad                                 Other                    1
Shared room                          Other                    1
Shared room in barn                  Other                    1
Shared room in bungalow              Other                    1
Shared room in kezhan                Other                    1
Treehouse                            Other                    1
#this function gives us an insight into the min, max, mean and median values
favstats(listings$minimum_nights) 
 min  Q1  median  Q3   max  mean  sd      n  missing
   1   1       1   1  1000  6.74  33  27805        0
#this chunk of code builds a density chart for the values and gives an idea of where the most common value is

listings %>% 
  ggplot(aes(x=minimum_nights))+
  geom_density()+
  NULL 

#this chunk of code counts how often each minimum_nights value occurs
listings %>% 
  count(minimum_nights) %>% 
  arrange(desc(n))
minimum_nights  n
1               23727
2               980
30              919
3               456
7               322
90              199
5               177
180             148
15              123
365             117
60              102
14              69
10              66
20              55
28              50
4               42
31              40
100             31
120             27
360             20
6               17
25              10
35              9
33              8
50              8
500             7
300             6
150             5
200             5
8               4
56              4
185             4
12              3
40              3
45              3
210             3
270             3
11              2
13              2
91              2
92              2
183             2
240             2
9               1
16              1
17              1
19              1
21              1
23              1
26              1
32              1
38              1
62              1
70              1
80              1
93              1
101             1
109             1
130             1
152             1
188             1
190             1
555             1
1000            1
# this code filters out all long term listings
short_term_listings <- listings %>% 
  filter(minimum_nights <=4)

Airbnb is most commonly used for travel purposes, i.e., as an alternative to traditional hotels. We only want to include listings in our regression analysis that are intended for travel purposes:

  • What are the most common values for the variable minimum_nights?

The most common value is 1 night

  • Is there any value among the common values that stands out?

The value that stands out is the biggest value in this column: 1000. We also noticed that some values are 365 and 180 days.

  • What is the likely intended purpose for Airbnb listings with this seemingly unusual value for minimum_nights?

We believe that the reason for the 1000 night value is to prevent Airbnb users from booking the room through the Airbnb system. In order to book a room, the user has to contact the host directly. This benefits the host, who bypasses the Airbnb commission.

When it comes to 365 and 180 values, these indicate that the host is looking for a long term renter

Filter the airbnb data so that it only includes observations with minimum_nights <= 4

3 Mapping

Visualisations of feature distributions and their relations are key to understanding a data set, and they can open up new lines of exploration. While we do not have time to go into all the wonderful geospatial visualisations one can do with R, you can use the following code to start with a map of your city, and overlay all AirBnB coordinates to get an overview of the spatial distribution of AirBnB rentals. For this visualisation we use the leaflet package, which includes a variety of tools for interactive maps, so you can easily zoom in-out, click on a point to get the actual AirBnB listing for that specific point, etc.

Assuming you have downloaded a dataframe listings with all Airbnb listings in Milan, the following code plots on the map all Airbnbs where minimum_nights is less than or equal to four (4). You can learn more about leaflet by following the relevant Datacamp course on mapping with leaflet.

leaflet(data = filter(listings, minimum_nights <= 4)) %>% 
  addProviderTiles("OpenStreetMap.Mapnik") %>% 
  addCircleMarkers(lng = ~longitude, 
                   lat = ~latitude, 
                   radius = 1, 
                   fillColor = "blue", 
                   fillOpacity = 0.4, 
                   popup = ~listing_url,
                   label = ~property_type)

4 Regression Analysis

For the target variable \(Y\), we will use the cost for two people to stay at an Airbnb location for four (4) nights.

Create a new variable called price_4_nights that uses price and accommodates to calculate the total cost for two people to stay at the Airbnb property for 4 nights. This is the variable \(Y\) we want to explain.

Use histograms or density plots to examine the distributions of price_4_nights and log(price_4_nights). Which variable should you use for the regression model? Why?
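A minimal sketch of this comparison on simulated prices follows. The log-normal prices are an assumption made only so that the example runs on its own; they are not the Shanghai data. With right-skewed prices the mean sits well above the median, while on the log10 scale the two are close, which is one quick numerical check to accompany the histograms.

```r
library(dplyr)
library(ggplot2)

# Simulated nightly prices (assumed log-normal, for illustration only)
set.seed(42)
toy <- tibble(price = rlnorm(500, meanlog = log(400), sdlog = 0.6)) %>%
  mutate(
    price_4_nights = price * 4,
    log_price_4_nights = log10(price_4_nights)
  )

# Histograms on the raw and on the log10 scale
p_raw <- ggplot(toy, aes(x = price_4_nights)) + geom_histogram(bins = 30)
p_log <- ggplot(toy, aes(x = log_price_4_nights)) + geom_histogram(bins = 30)

# Mean versus median on both scales
toy %>%
  summarise(across(
    c(price_4_nights, log_price_4_nights),
    list(mean = mean, median = median)
  ))
```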

For the regression model we should use the log of price_4_nights, because its distribution is approximately normal.

Fit a regression model called model1 with the following explanatory variables: prop_type_simplified, number_of_reviews, and review_scores_rating.

  • Interpret the coefficient review_scores_rating in terms of price_4_nights.

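The coefficient for review_scores_rating is positive and statistically significant: the higher the rating, the pricier the listing. Because the response is the log10 of price_4_nights, the model1 estimate of 0.0185 means that a one-point higher rating is associated with a price_4_nights that is roughly 4% higher (10^0.0185 ≈ 1.04), holding property type and number of reviews constant.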

  • Interpret the coefficient of prop_type_simplified in terms of price_4_nights.

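The coefficients for prop_type_simplified compare each category with the baseline, Entire rental unit. In model1, Entire villa tends to be more expensive (estimate 0.065, roughly 16% higher price_4_nights) and Other tends to be cheaper (estimate -0.095, roughly 20% lower); both differences are statistically significant. Entire residential home and Private room in villa do not differ significantly from the baseline.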

We want to determine if room_type is a significant predictor of the cost for 4 nights, given everything else in the model. Fit a regression model called model2 that includes all of the explanatory variables in model1 plus room_type.

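Model 2 shows that room type is a significant predictor of the price. More specifically, compared with Entire home/apt, private rooms (estimate -0.154, roughly 30% lower price_4_nights) and shared rooms (estimate -0.402, roughly 60% lower) are less expensive.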
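Whether a categorical variable is significant "given everything else in the model" can also be checked with a single partial F-test on the two nested models, rather than by reading the individual dummy coefficients. The sketch below uses simulated data with a built-in room_type effect; the effect sizes and the object names m_small and m_large are assumptions for illustration.

```r
# Simulated data with an assumed room_type effect (illustration only)
set.seed(7)
n <- 300
toy <- data.frame(
  review_scores_rating = runif(n, 3, 5),
  room_type = sample(c("Entire home/apt", "Private room", "Shared room"),
                     n, replace = TRUE)
)
effect <- c("Entire home/apt" = 0, "Private room" = -0.15, "Shared room" = -0.40)
toy$log_price_4_nights <- 3.1 + 0.02 * toy$review_scores_rating +
  unname(effect[toy$room_type]) + rnorm(n, sd = 0.2)

# Nested models: the larger one adds room_type to the smaller one
m_small <- lm(log_price_4_nights ~ review_scores_rating, data = toy)
m_large <- lm(log_price_4_nights ~ review_scores_rating + room_type, data = toy)

# Partial F-test: does adding room_type reduce the residual sum of squares
# by more than chance alone would explain?
f_test <- anova(m_small, m_large)
f_test
```

Note that anova() requires both models to be fitted on the same rows, so with real data any rows with missing values in the added variable would need to be removed before fitting both models.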

# This code builds a dataset that only contains accommodations that host exactly two people and creates a variable for the 4-night-stay price. It also creates a variable that is the log10 of the price for 4 nights

short_term_listings_for_2 <- short_term_listings %>% 
  filter(accommodates ==2) %>% 
  mutate(price_4_nights = price*4) %>% 
  mutate(log_price_4_nights = log(price_4_nights,10))
  
# This code builds a histogram of the log10 price of a 4-night stay for 2 people in Shanghai.
short_term_listings_for_2 %>% 
  ggplot(aes(x = log_price_4_nights)) +
  geom_histogram() +
  NULL

#This bit of code builds a regression model - model1

model1 <- lm(log_price_4_nights ~ prop_type_simplified + review_scores_rating + number_of_reviews, data=  short_term_listings_for_2)
  
summary(model1)

Call:
lm(formula = log_price_4_nights ~ prop_type_simplified + review_scores_rating + 
    number_of_reviews, data = short_term_listings_for_2)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.90326 -0.14583 -0.00672  0.13893  2.19988 

Coefficients:
                                              Estimate Std. Error t value
(Intercept)                                  3.159e+00  1.577e-02 200.286
prop_type_simplifiedEntire residential home  7.655e-03  1.231e-02   0.622
prop_type_simplifiedEntire villa             6.534e-02  1.780e-02   3.670
prop_type_simplifiedOther                   -9.489e-02  6.683e-03 -14.200
prop_type_simplifiedPrivate room in villa   -8.682e-03  8.711e-03  -0.997
review_scores_rating                         1.852e-02  3.235e-03   5.727
number_of_reviews                            5.252e-05  7.563e-05   0.695
                                            Pr(>|t|)    
(Intercept)                                  < 2e-16 ***
prop_type_simplifiedEntire residential home 0.533980    
prop_type_simplifiedEntire villa            0.000244 ***
prop_type_simplifiedOther                    < 2e-16 ***
prop_type_simplifiedPrivate room in villa   0.318958    
review_scores_rating                        1.06e-08 ***
number_of_reviews                           0.487386    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2376 on 7853 degrees of freedom
  (4449 observations deleted due to missingness)
Multiple R-squared:  0.04613,   Adjusted R-squared:  0.0454 
F-statistic: 63.29 on 6 and 7853 DF,  p-value: < 2.2e-16
#This code gives an insight into the room_type variable

unique(short_term_listings_for_2$room_type)
[1] "Entire home/apt" "Private room"    "Shared room"    
#This code builds a model that includes the room type

model2 <- lm(log_price_4_nights ~ prop_type_simplified + review_scores_rating + number_of_reviews + room_type, data=  short_term_listings_for_2)
  
summary(model2)

Call:
lm(formula = log_price_4_nights ~ prop_type_simplified + review_scores_rating + 
    number_of_reviews + room_type, data = short_term_listings_for_2)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.90439 -0.14246 -0.01569  0.12353  2.20387 

Coefficients:
                                              Estimate Std. Error t value
(Intercept)                                  3.163e+00  1.527e-02 207.140
prop_type_simplifiedEntire residential home  8.123e-03  1.190e-02   0.682
prop_type_simplifiedEntire villa             6.428e-02  1.722e-02   3.733
prop_type_simplifiedOther                    1.041e-02  8.122e-03   1.282
prop_type_simplifiedPrivate room in villa    1.459e-01  1.133e-02  12.872
review_scores_rating                         1.817e-02  3.131e-03   5.804
number_of_reviews                           -3.433e-05  7.324e-05  -0.469
room_typePrivate room                       -1.544e-01  7.588e-03 -20.351
room_typeShared room                        -4.020e-01  2.691e-02 -14.938
                                            Pr(>|t|)    
(Intercept)                                  < 2e-16 ***
prop_type_simplifiedEntire residential home 0.494997    
prop_type_simplifiedEntire villa            0.000191 ***
prop_type_simplifiedOther                   0.200032    
prop_type_simplifiedPrivate room in villa    < 2e-16 ***
review_scores_rating                        6.74e-09 ***
number_of_reviews                           0.639307    
room_typePrivate room                        < 2e-16 ***
room_typeShared room                         < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2298 on 7851 degrees of freedom
  (4449 observations deleted due to missingness)
Multiple R-squared:  0.108, Adjusted R-squared:  0.1071 
F-statistic: 118.8 on 8 and 7851 DF,  p-value: < 2.2e-16

4.1 Further variables/questions to explore on our own

Our dataset has many more variables, so here are some ideas on how you can extend your analysis

  1. Are the number of bathrooms, bedrooms, beds, or size of the house (accommodates) significant predictors of price_4_nights? Or might these be collinear variables?
# We create a dataset that contains flats with all accommodation capacities and generate a price for 4 nights variable. We also convert bathrooms_text into a numeric variable
short_term_listings <- short_term_listings %>% 
  mutate(price_4_nights = price*4) %>% 
  mutate(log_price_4_nights = log(price_4_nights,10)) %>% 
  mutate(bathrooms_text = parse_number(bathrooms_text))


# We create a model that tests the effect of bathrooms, bedrooms and beds
model3 <- lm(log_price_4_nights ~ bathrooms + bedrooms + beds, data= short_term_listings)
summary(model3)

Call:
lm(formula = log_price_4_nights ~ bathrooms + bedrooms + beds, 
    data = short_term_listings)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5659 -0.1735 -0.0005  0.1767  2.2170 

Coefficients:
            Estimate Std. Error  t value Pr(>|t|)    
(Intercept) 3.101475   0.002922 1061.566   <2e-16 ***
bathrooms   0.022764   0.002286    9.957   <2e-16 ***
bedrooms    0.091740   0.002294   39.995   <2e-16 ***
beds        0.018991   0.001448   13.111   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3266 on 24038 degrees of freedom
  (1163 observations deleted due to missingness)
Multiple R-squared:  0.3866,    Adjusted R-squared:  0.3866 
F-statistic:  5051 on 3 and 24038 DF,  p-value: < 2.2e-16
#We check for collinearity using a diagnostics test
vif(model3)
bathrooms  bedrooms      beds 
 3.888600  4.374824  3.785109 
# produce scatterplot-correlation matrix between all explanatory variables
short_term_listings %>%
  select(c(bedrooms, bathrooms, beds)) %>%
  ggpairs(alpha = 0.3) 
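
To make the VIF values above less of a black box, here is a minimal sketch of how a VIF can be computed by hand: regress each predictor on the others and take 1 / (1 - R^2). It uses simulated stand-in data rather than the Airbnb listings, and the names `toy` and `vif_by_hand` are hypothetical. For a model with only numeric predictors this should match what `car::vif()` reports.

```r
# Simulated stand-in data: three correlated size measures (NOT the Airbnb data)
set.seed(42)
n <- 500
bedrooms  <- rpois(n, 2) + 1
beds      <- bedrooms + rpois(n, 1)
bathrooms <- pmax(1, round(bedrooms / 2 + rnorm(n, sd = 0.5)))
toy <- data.frame(bedrooms, beds, bathrooms)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors
vif_by_hand <- sapply(names(toy), function(v) {
  others <- setdiff(names(toy), v)
  r2 <- summary(lm(reformulate(others, response = v), data = toy))$r.squared
  1 / (1 - r2)
})

round(vif_by_hand, 2)
```

A VIF of 1 means a predictor is uncorrelated with the others; the values of roughly 4 reported for model3 indicate noticeable overlap between bathrooms, bedrooms and beds, although they stay below the usual threshold of 5.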

  2. Do superhosts (host_is_superhost) command a pricing premium, after controlling for other variables?
model_Superhost <- lm(log_price_4_nights ~ room_type + review_scores_rating + beds + host_is_superhost, data= short_term_listings)

summary(model_Superhost)

Call:
lm(formula = log_price_4_nights ~ room_type + review_scores_rating + 
    beds + host_is_superhost, data = short_term_listings)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1921 -0.1670 -0.0209  0.1493  2.1920 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)            3.1639188  0.0121336 260.757  < 2e-16 ***
room_typePrivate room -0.1874041  0.0046330 -40.450  < 2e-16 ***
room_typeShared room  -0.7208898  0.0142316 -50.654  < 2e-16 ***
review_scores_rating   0.0114577  0.0025435   4.505  6.7e-06 ***
beds                   0.0834936  0.0009063  92.126  < 2e-16 ***
host_is_superhostTRUE  0.0446902  0.0045234   9.880  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2789 on 16335 degrees of freedom
  (8864 observations deleted due to missingness)
Multiple R-squared:  0.4549,    Adjusted R-squared:  0.4548 
F-statistic:  2727 on 5 and 16335 DF,  p-value: < 2.2e-16
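
Because the response is log10(price_4_nights), the coefficients are easier to read once converted into percentage changes in price. The sketch below does that arithmetic for two coefficients copied from the summary of model_Superhost above; the helper `pct_change()` is a hypothetical name introduced only for this illustration.

```r
# Coefficients copied from summary(model_Superhost)
beta_superhost    <- 0.0446902
beta_private_room <- -0.1874041

# With a log10 response, a one-unit change in a predictor multiplies the
# predicted price by 10^beta, i.e. changes it by (10^beta - 1) * 100 percent
pct_change <- function(beta) (10^beta - 1) * 100

premium_pct  <- pct_change(beta_superhost)
discount_pct <- pct_change(beta_private_room)

round(c(superhost = premium_pct, private_room = discount_pct), 1)
```

Holding the other variables fixed, this suggests a superhost premium of about 11% and a private-room discount of about 35% relative to an entire home or apartment.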
  3. Some hosts allow you to immediately book their listing (instant_bookable == TRUE), while a non-trivial proportion don’t. After controlling for other variables, is instant_bookable a significant predictor of price_4_nights?
# exploring raw materials and summary statistics for instant bookable
skim(short_term_listings$instant_bookable)
Data summary
Name                     short_term_listings$insta…
Number of rows           25205
Number of columns        1
Column type frequency:
  logical                1
Group variables          None

Variable type: logical

skim_variable  n_missing  complete_rate  mean  count
data                   0              1  0.69  TRU: 17458, FAL: 7747
# regression model fitting
model_instant <- lm(log_price_4_nights ~ room_type+review_scores_rating + beds + host_is_superhost + instant_bookable, data=short_term_listings)
summary(model_instant)

Call:
lm(formula = log_price_4_nights ~ room_type + review_scores_rating + 
    beds + host_is_superhost + instant_bookable, data = short_term_listings)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1698 -0.1659 -0.0218  0.1482  2.2201 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)            3.1439908  0.0123103 255.395  < 2e-16 ***
room_typePrivate room -0.1857387  0.0046258 -40.153  < 2e-16 ***
room_typeShared room  -0.7136245  0.0142212 -50.180  < 2e-16 ***
review_scores_rating   0.0092219  0.0025499   3.617 0.000299 ***
beds                   0.0835014  0.0009041  92.354  < 2e-16 ***
host_is_superhostTRUE  0.0374580  0.0045853   8.169 3.33e-16 ***
instant_bookableTRUE   0.0450179  0.0050617   8.894  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2782 on 16334 degrees of freedom
  (8864 observations deleted due to missingness)
Multiple R-squared:  0.4576,    Adjusted R-squared:  0.4574 
F-statistic:  2296 on 6 and 16334 DF,  p-value: < 2.2e-16
# check for collinearity using a diagnostics test
vif(model_instant)
                         GVIF Df GVIF^(1/(2*Df))
room_type            1.058736  2        1.014371
review_scores_rating 1.075508  1        1.037067
beds                 1.042461  1        1.021010
host_is_superhost    1.108619  1        1.052910
instant_bookable     1.058310  1        1.028742
  4. For all cities, there are 3 variables that relate to neighbourhoods: neighbourhood, neighbourhood_cleansed, and neighbourhood_group_cleansed. There are typically more than 20 neighbourhoods in each city, and it wouldn’t make sense to include them all in your model. Use your city knowledge, or ask someone with city knowledge, and see whether you can group neighbourhoods together so the majority of listings fall in fewer (5-6 max) geographical areas. You would thus need to create a new categorical variable neighbourhood_simplified and determine whether location is a predictor of price_4_nights
# For Shanghai, "neighbourhood" contains only "Shanghai, China" and NA, while "neighbourhood_group_cleansed" contains only NA. "neighbourhood_cleansed" holds the district of each listing. Shanghai has 16 districts, so we group them into five tiers by distance from the city center and code the tiers 1 (city center) to 5 (most remote). Intuitively, we would expect listings in urban areas to have a higher price. For example, Huangpu and Jing'an are the districts nearest to the city center, so they are coded 1.

# Huangpu, Jing'an - tier 1 districts (city center), coded 1
# Changning, Xuhui, Yangpu, Hongkou, Putuo - tier 2 districts (urban area), coded 2
# Pudong - tier 3 district (Pudong is large: half urban, half on the outskirts), coded 3
# Baoshan, Jiading, Minhang, Songjiang, Qingpu, Fengxian, Jinshan - tier 4 districts (outskirts of Shanghai), coded 4
# Chongming - tier 5 district (island in Shanghai), coded 5

listings_neighbourhood <- short_term_listings %>%
  mutate(neighbourhood_simplified = 
          case_when(neighbourhood_cleansed %in% c("黄浦区 / Huangpu District", "静安区 / Jing'an District")~ 1,
                    neighbourhood_cleansed %in% c("长宁区 / Changning District", "徐汇区 / Xuhui District", "杨浦区 / Yangpu District", "虹口区 / Hongkou District","普陀区 / Putuo District")~2,
                    neighbourhood_cleansed %in% c("浦东新区 / Pudong")~3,
                    neighbourhood_cleansed %in% c("宝山区 / Baoshan District","嘉定区 / Jiading District","闵行区 / Minhang District","松江区 / Songjiang District","青浦区 / Qingpu District","奉贤区 / Fengxian District","金山区 / Jinshan District")~4,
                    neighbourhood_cleansed %in% c("崇明区 / Chongming District")~5))

#check if we cover all districts
unique(listings_neighbourhood$neighbourhood_simplified) 
[1] 2 1 3 5 4
# Neighbourhood_simplified is numeric. For the model to run correctly, it needs to be a factor variable.
listings_neighbourhood$neighbourhood_simplified <- as.factor(listings_neighbourhood$neighbourhood_simplified)

# final model
model_neighbourhood <- lm(log_price_4_nights ~ room_type+review_scores_rating + beds + host_is_superhost + instant_bookable+neighbourhood_simplified, data=listings_neighbourhood)

summary(model_neighbourhood)

Call:
lm(formula = log_price_4_nights ~ room_type + review_scores_rating + 
    beds + host_is_superhost + instant_bookable + neighbourhood_simplified, 
    data = listings_neighbourhood)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0756 -0.1626 -0.0252  0.1403  2.2534 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                3.2050812  0.0128273 249.864  < 2e-16 ***
room_typePrivate room     -0.1916452  0.0049393 -38.800  < 2e-16 ***
room_typeShared room      -0.7142902  0.0141357 -50.531  < 2e-16 ***
review_scores_rating       0.0075184  0.0025250   2.978  0.00291 ** 
beds                       0.0805626  0.0009561  84.265  < 2e-16 ***
host_is_superhostTRUE      0.0374241  0.0045912   8.151 3.86e-16 ***
instant_bookableTRUE       0.0397366  0.0050346   7.893 3.14e-15 ***
neighbourhood_simplified2 -0.0779772  0.0070067 -11.129  < 2e-16 ***
neighbourhood_simplified3 -0.0295876  0.0064338  -4.599 4.28e-06 ***
neighbourhood_simplified4 -0.1016003  0.0072010 -14.109  < 2e-16 ***
neighbourhood_simplified5  0.0574076  0.0113471   5.059 4.25e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2752 on 16330 degrees of freedom
  (8864 observations deleted due to missingness)
Multiple R-squared:  0.4695,    Adjusted R-squared:  0.4692 
F-statistic:  1445 on 10 and 16330 DF,  p-value: < 2.2e-16
  5. What is the effect of availability_30 or reviews_per_month on price_4_nights, after we control for other variables?
# using the data from the previous model to check if availability affects the price

model_availability30 <- lm(log_price_4_nights ~ room_type+review_scores_rating + beds + host_is_superhost + instant_bookable+neighbourhood_simplified+availability_30+number_of_reviews, data=listings_neighbourhood)

summary(model_availability30)

Call:
lm(formula = log_price_4_nights ~ room_type + review_scores_rating + 
    beds + host_is_superhost + instant_bookable + neighbourhood_simplified + 
    availability_30 + number_of_reviews, data = listings_neighbourhood)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0252 -0.1622 -0.0266  0.1370  2.2244 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                3.174e+00  1.312e-02 241.985  < 2e-16 ***
room_typePrivate room     -1.937e-01  4.919e-03 -39.372  < 2e-16 ***
room_typeShared room      -7.245e-01  1.409e-02 -51.426  < 2e-16 ***
review_scores_rating       8.110e-03  2.519e-03   3.220  0.00128 ** 
beds                       8.012e-02  9.525e-04  84.123  < 2e-16 ***
host_is_superhostTRUE      3.962e-02  4.649e-03   8.524  < 2e-16 ***
instant_bookableTRUE       2.685e-02  5.126e-03   5.239 1.64e-07 ***
neighbourhood_simplified2 -8.110e-02  6.979e-03 -11.621  < 2e-16 ***
neighbourhood_simplified3 -3.789e-02  6.446e-03  -5.878 4.24e-09 ***
neighbourhood_simplified4 -1.116e-01  7.249e-03 -15.391  < 2e-16 ***
neighbourhood_simplified5  4.469e-02  1.136e-02   3.932 8.45e-05 ***
availability_30            2.558e-03  2.168e-04  11.801  < 2e-16 ***
number_of_reviews         -2.960e-04  6.436e-05  -4.599 4.27e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2738 on 16328 degrees of freedom
  (8864 observations deleted due to missingness)
Multiple R-squared:  0.4748,    Adjusted R-squared:  0.4744 
F-statistic:  1230 on 12 and 16328 DF,  p-value: < 2.2e-16

Our Best Model

huxreg(model3, model_Superhost, model_instant, model_neighbourhood, model_availability30)
                           (1)          (2)          (3)          (4)          (5)
(Intercept)                3.101 ***    3.164 ***    3.144 ***    3.205 ***    3.174 ***
                           (0.003)      (0.012)      (0.012)      (0.013)      (0.013)
bathrooms                  0.023 ***
                           (0.002)
bedrooms                   0.092 ***
                           (0.002)
beds                       0.019 ***    0.083 ***    0.084 ***    0.081 ***    0.080 ***
                           (0.001)      (0.001)      (0.001)      (0.001)      (0.001)
room_typePrivate room                   -0.187 ***   -0.186 ***   -0.192 ***   -0.194 ***
                                        (0.005)      (0.005)      (0.005)      (0.005)
room_typeShared room                    -0.721 ***   -0.714 ***   -0.714 ***   -0.725 ***
                                        (0.014)      (0.014)      (0.014)      (0.014)
review_scores_rating                    0.011 ***    0.009 ***    0.008 **     0.008 **
                                        (0.003)      (0.003)      (0.003)      (0.003)
host_is_superhostTRUE                   0.045 ***    0.037 ***    0.037 ***    0.040 ***
                                        (0.005)      (0.005)      (0.005)      (0.005)
instant_bookableTRUE                                 0.045 ***    0.040 ***    0.027 ***
                                                     (0.005)      (0.005)      (0.005)
neighbourhood_simplified2                                         -0.078 ***   -0.081 ***
                                                                  (0.007)      (0.007)
neighbourhood_simplified3                                         -0.030 ***   -0.038 ***
                                                                  (0.006)      (0.006)
neighbourhood_simplified4                                         -0.102 ***   -0.112 ***
                                                                  (0.007)      (0.007)
neighbourhood_simplified5                                         0.057 ***    0.045 ***
                                                                  (0.011)      (0.011)
availability_30                                                                0.003 ***
                                                                               (0.000)
number_of_reviews                                                              -0.000 ***
                                                                               (0.000)
N                          24042        16341        16341        16341        16341
R2                         0.387        0.455        0.458        0.470        0.475
logLik                     -7206.826    -2317.836    -2278.365    -2095.955    -2014.487
AIC                        14423.652    4649.672     4572.730     4215.910     4056.974
*** p < 0.001; ** p < 0.01; * p < 0.05.
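
One caveat when reading the table above: model (1) is fitted on 24042 listings while the others use 16341, because lm() drops rows with missing predictors. Log-likelihood and AIC depend on which rows are used, so they are only comparable between models fitted to the same observations. The sketch below illustrates one way to put two models on the same footing; it uses simulated data, and `toy`, `m_small` and `m_large` are hypothetical names.

```r
# Simulated stand-in data with missing ratings (NOT the Airbnb data)
set.seed(1)
n <- 400
toy <- data.frame(
  beds   = rpois(n, 2) + 1,
  rating = round(runif(n, 3, 5), 1)
)
toy$log_price <- 3 + 0.08 * toy$beds + 0.01 * toy$rating + rnorm(n, sd = 0.25)
toy$rating[sample(n, 100)] <- NA   # 100 listings have no rating

# lm() silently drops rows with a missing predictor,
# so these two models are fitted on different rows
m_small <- lm(log_price ~ beds, data = toy)
m_large <- lm(log_price ~ beds + rating, data = toy)
c(nobs(m_small), nobs(m_large))

# Refit both on the rows that are complete for every variable involved
complete <- toy[complete.cases(toy[, c("log_price", "beds", "rating")]), ]
m_small_cc <- update(m_small, data = complete)
m_large_cc <- update(m_large, data = complete)

# Now the AIC values refer to the same observations
AIC(m_small_cc, m_large_cc)
```

Adjusted R-squared is less sensitive to the sample size than AIC, but it is still safer to compare it across models fitted to the same rows.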

4.2 Diagnostics, collinearity, summary tables

As you keep building your models, it makes sense to:

  1. Check the residuals, using autoplot(model_x)

  2. As you start building models with more explanatory variables, make sure you use `car::vif(model_x)` to calculate the Variance Inflation Factor (VIF) for your predictors and determine whether you have collinear variables. A general guideline is that a VIF larger than 5 or 10 is large, and your model may suffer from collinearity. Remove the variable in question and run your model again without it.

  3. Create a summary table, using huxtable (https://mfa2022.netlify.app/example/modelling_side_by_side_tables/) that shows which models you worked on, which predictors are significant, the adjusted \(R^2\), and the Residual Standard Error.

# Create a table that shows the models produced in this analysis
huxreg(model1, model2, model3)
                                             (1)          (2)          (3)
(Intercept)                                  3.159 ***    3.163 ***    3.101 ***
                                             (0.016)      (0.015)      (0.003)
prop_type_simplifiedEntire residential home  0.008        0.008
                                             (0.012)      (0.012)
prop_type_simplifiedEntire villa             0.065 ***    0.064 ***
                                             (0.018)      (0.017)
prop_type_simplifiedOther                    -0.095 ***   0.010
                                             (0.007)      (0.008)
prop_type_simplifiedPrivate room in villa    -0.009       0.146 ***
                                             (0.009)      (0.011)
review_scores_rating                         0.019 ***    0.018 ***
                                             (0.003)      (0.003)
number_of_reviews                            0.000        -0.000
                                             (0.000)      (0.000)
room_typePrivate room                                     -0.154 ***
                                                          (0.008)
room_typeShared room                                      -0.402 ***
                                                          (0.027)
bathrooms                                                              0.023 ***
                                                                       (0.002)
bedrooms                                                               0.092 ***
                                                                       (0.002)
beds                                                                   0.019 ***
                                                                       (0.001)
N                                            7860         7860         24042
R2                                           0.046        0.108        0.387
logLik                                       146.818      410.394      -7206.826
AIC                                          -277.635     -800.789     14423.652
*** p < 0.001; ** p < 0.01; * p < 0.05.
  4. Finally, you must use the best model you came up with for prediction. Suppose you are planning to visit the city you have been assigned to over reading week, and you want to stay in an Airbnb. Find Airbnbs in your destination city that are apartments with a private room, have at least 10 reviews, and an average rating of at least 90. Use your best model to predict the total cost to stay at this Airbnb for 4 nights. Include the appropriate 95% interval with your prediction. Report the point prediction and interval in terms of price_4_nights.
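
The task asks for a 95% interval as well as a point prediction. As a minimal sketch of how this could be done, `predict()` accepts `interval = "prediction"`, and the resulting bounds can be back-transformed from the log10 scale in the same way as the fit. The example uses simulated data and a deliberately small model; `toy`, `toy_model` and `new_listing` are hypothetical names, not objects from this analysis.

```r
# Simulated stand-in data (NOT the Airbnb data)
set.seed(7)
n <- 300
toy <- data.frame(
  room_type = factor(sample(c("Entire home/apt", "Private room"), n, replace = TRUE)),
  beds      = rpois(n, 1) + 1
)
toy$log_price_4_nights <- 3.2 - 0.19 * (toy$room_type == "Private room") +
  0.08 * toy$beds + rnorm(n, sd = 0.25)

toy_model <- lm(log_price_4_nights ~ room_type + beds, data = toy)

# A single hypothetical listing to predict for
new_listing <- data.frame(
  room_type = factor("Private room", levels = levels(toy$room_type)),
  beds      = 1
)

# Point prediction with a 95% prediction interval, on the log10 scale
pred_log <- predict(toy_model, newdata = new_listing,
                    interval = "prediction", level = 0.95)

# Back-transform fit, lower and upper bound to the price scale
pred_price <- 10^pred_log
pred_price
```

A prediction interval, rather than a confidence interval, is the appropriate choice when the question concerns the price of an individual listing. Note that the back-transformed interval is not symmetric around the point prediction.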
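# Fitted equation for model_availability30 (coefficients rounded; the response is log10 of price_4_nights and neighbourhood tier 1 is the baseline):
# log10(price_4_nights) = 3.174 - 0.194*PrivateRoom - 0.725*SharedRoom + 0.008*ReviewScoresRating + 0.080*Beds + 0.040*Superhost + 0.027*InstantBookable - 0.081*Tier2 - 0.038*Tier3 - 0.112*Tier4 + 0.045*Tier5 + 0.0026*Availability30 - 0.0003*NumberOfReviews

# Example: private room in tier 1, rating 4.5, 10 reviews, all other terms set to 0
# log10(price_4_nights) = 3.174 - 0.194*1 - 0.725*0 + 0.008*4.5 + 0.080*0 + 0.040*0 + 0.027*0 + 0.0026*0 - 0.0003*10

#predicted_value <- 3.174 - 0.194*1 - 0.725*0 + 0.008*4.5 + 0.080*0 + 0.040*0 + 0.027*0 + 0.0026*0 - 0.0003*10
#10^predicted_value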

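# Keep private rooms with at least 10 reviews and a rating of at least 4.5
# (review_scores_rating appears to be on a 0-5 scale in this dataset, so 4.5 stands in for 90 out of 100)
applied_filter <- listings_neighbourhood %>% 
  filter(room_type=="Private room") %>% 
  filter(number_of_reviews>=10) %>% 
  filter(review_scores_rating>=4.5)

# Predict log10(price_4_nights) for each remaining listing (point predictions only)
dataframe <- predict(model_availability30, applied_filter)

# Back-transform from the log10 scale to price_4_nights
10^dataframe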
          1           2           3           4           5           6 
  1275.4949   1346.5120   1266.9092   1152.9268   1232.4632   1290.6690 
          7           8           9          10          11          12 
  1527.4583   1328.6223   1055.2231   1266.5385   1112.5901   1023.4277 
         13          14          15          16          17          18 
  1470.2526   1491.1061   1361.9551   2244.0738   1254.4833   1241.8753 
         19          20          21          22          23          24 
  1252.4811   1427.9473   1059.8279   1239.9096   1033.7987   1188.3356 
         25          26          27          28          29          30 
  1379.9411   1509.1593   1001.7281   1199.6359   1278.0716   1239.4750 
         31          32          33          34          35          36 
  1265.8990   1341.7781   1344.3542    994.9365   1236.9459   1340.3228 
         37          38          39          40          41          42 
  1284.7179   1130.1182   1127.7482   1286.8626   1346.1761   1094.8790 
         43          44          45          46          47          48 
  1023.2156   1453.9697   1502.6170   1177.5586   1092.3357   1375.4696 
         49          50          51          52          53          54 
  1112.2448   1299.8408   1249.2772   1466.7897   1428.8350   1427.7765 
         55          56          57          58          59          60 
  1191.9790   1212.7276   1098.5425   1256.1197   1461.8430   1476.8664 
         61          62          63          64          65          66 
  1243.3637   1331.4864   1349.6244   1336.4818   1537.1636   1186.3584 
         67          68          69          70          71          72 
  1211.1045    978.9252   1208.8643   1337.1491   1051.4769  12920.5103 
         73          74          75          76          77          78 
  1630.2401   1759.7858   1231.1624   1332.9888   1513.5606   1157.6274 
         79          80          81          82          83          84 
  1179.3057   1548.1918  11125.0625   1522.7418   1313.2731   1301.8893 
         85          86          87          88          89          90 
  1107.7698   2127.1264   1099.3073   1027.4517   1363.5661   1023.5860 
         91          92          93          94          95          96 
  1050.7414   1333.3550   1264.7306   1572.8629   1239.3559   1566.9995 
         97          98          99         100         101         102 
  1603.3765   1311.3665   2167.9078   1127.2700   1036.7542   1279.8258 
        103         104         105         106         107         108 
  1031.6869   1341.0623   1432.1550   1005.8615   1258.8788   1235.0554 
        109         110         111         112         113         114 
  1288.6782   1124.4084   1505.2307   1018.3595   1450.5588   1218.4467 
        115         116         117         118         119         120 
  1627.0898   1026.0759   1141.1901   1116.5573   1198.7846    968.1056 
        121         122         123         124         125         126 
  1353.1393   1206.9533   1410.4160   1201.0511   1118.9454   1242.6788 
        127         128         129         130         131         132 
  1327.8769   1755.1130   1329.8676   1316.2917   1354.3586   1011.1816 
        133         134         135         136         137         138 
  1140.1171   1461.6834   1435.5574   1141.5118   1306.8927   1337.6307 
        139         140         141         142         143         144 
  1237.7062   1111.8217    916.9180   1726.7432   1456.8156   1300.1927 
        145         146         147         148         149         150 
  1621.5604   1388.1089   1360.7711   1125.9506   1253.7392   1699.5005 
        151         152         153         154         155         156 
  2651.8205   1253.0039   1301.6344    997.2945   1707.4567    933.9124 
        157         158         159         160         161         162 
  1043.3878   1122.7841    921.1548   1581.9069    987.0457   1831.6469 
        163         164         165         166         167         168 
  1535.7438   1504.4754   1493.6973   1517.1293   1514.4827   1503.7311 
        169         170         171         172         173         174 
   986.1030   1423.2617   1180.4772   1475.1424   1617.4883   1273.3609 
        175         176         177         178         179         180 
  1371.8235   1240.0473   1019.0206   1724.8423   1185.9706   1204.1430 
        181         182         183         184         185         186 
  1422.9055    842.3246   1200.9727   1473.1094   1486.8785   1398.3788 
        187         188         189         190         191         192 
  1363.5372   1541.0093   1427.0955   1378.9295   1140.6812   1642.5919 
        193         194         195         196         197         198 
  1355.5757   1132.2118   1220.3562   1242.8800   1419.4534   1675.6970 
        199         200         201         202         203         204 
  1062.3404   1035.6634   1101.8051   1762.4338   1438.7043   1264.7175 
        205         206         207         208         209         210 
  1424.9498   1134.7723   1726.2116   1729.7612   1761.8908    980.9668 
        211         212         213         214         215         216 
   963.6088   1455.2854   1154.0909   1092.0684   1002.7776   1443.8856 
        217         218         219         220         221         222 
  1145.8671   1751.0990   1320.3731   1109.6041   1118.7360    996.3332 
        223         224         225         226         227         228 
  1469.9205   1265.5202   1724.5246   1178.1141   1262.4679   1280.4301 
        229         230         231         232         233         234 
  1537.0142   1409.7794    986.7082   1411.4209   1068.5323   3723.2077 
        235         236         237         238         239         240 
  1890.8069   1542.5322   1204.0464   1228.2785   1466.7458   1789.3862 
        241         242         243         244         245         246 
  1159.5924   1763.5002   1554.3579   1340.4348   2005.0977   1566.7120 
        247         248         249         250         251         252 
  2025.9392   1448.8384   1621.3316   1424.3835    938.9047   1112.9613 
        253         254         255         256         257         258 
  1876.0874   1485.4075   1357.0175   1365.1889   1307.7706   1256.9355 
        259         260         261         262         263         264 
  1351.0393   1119.4578   1419.4499   1277.5615   1301.0583   1648.0856 
        265         266         267         268         269         270 
  1225.1458   1542.4337   1195.3900   1270.7173   1297.6049   1263.5369 
        271         272         273         274         275         276 
  1083.4176   1128.7930   1254.1495   1168.1807   1159.5492   1324.6103 
        277         278         279         280         281         282 
   804.4566   1106.7701   1254.9654   1702.8972   5445.0019   1802.1388 
        283         284         285         286         287         288 
  1343.1007   1397.5969   1479.2432   1412.6356   1335.9093   1366.1313 
        289         290         291         292         293         294 
  1711.9649   1530.3522   1345.0192   1271.1836   1305.4438   1292.7944 
        295         296         297         298         299         300 
  1245.0310   1051.8707   1220.2147   1439.1521   1824.9451   2531.9214 
        301         302         303         304         305         306 
  1305.4089   1467.8522   3343.3587   1504.6610   1196.1845   1524.3826 
        307         308         309         310         311         312 
  1463.4812   1369.4691   1178.3603   1485.3624   1016.3975   1039.1992 
        313         314         315         316         317         318 
 15301.4290   1123.7513   1261.8228   1085.6546   1203.9921   1650.0322 
        319         320         321         322         323         324 
  1361.8384   1231.9249   3220.1038   2260.1221   1221.8727   1350.2201 
        325         326         327         328         329         330 
  1212.5969   1362.7699   1497.7242   2566.0564   1010.4192   1267.6275 
        331         332         333         334         335         336 
  1339.6089   1350.4951   1474.6646   1360.2170   1589.8285   1417.1322 
        337         338         339         340         341         342 
  1488.0610   1394.9228   1100.7274   1402.4764   1389.2444   1504.1574 
        343         344         345         346         347         348 
  1160.6850   1244.2816   1416.2450   1453.8508   1332.3660   1391.0245 
        349         350         351         352         353         354 
  1204.8816   1408.2456   1347.3629   1420.0998   1528.5363   1514.2907 
        355         356         357         358         359         360 
  1764.5378   1503.2516   1539.8646   1514.9582   1322.4862   1242.4295 
        361         362         363         364         365         366 
  1573.1378   1444.4310   1470.2661   1724.7461   1718.4571   1424.4524 
        367         368         369         370         371         372 
  1386.2105   1715.8072   2134.5435   1781.3374   1414.8517   1771.1076 
        373         374         375         376         377         378 
  1351.5205   1601.5654   1398.2240   1395.8638    995.9153   1212.3719 
        379         380         381         382         383         384 
  1067.4168   1557.5291   1508.1403   1444.0060   1234.0110   1765.0111 
        385         386         387         388         389         390 
  1434.9143   1368.9258   1249.2790   2162.4112   1789.9908   1428.9157 
        391         392         393         394         395         396 
  1425.8876   1475.5552    983.9554    972.9826   1421.4587   1473.8466 
        397         398         399         400         401         402 
  1723.2758   1470.4786   1794.4579   1471.6600   1737.7074   2502.2987 
        403         404         405         406         407         408 
  2062.5432   1396.3267   1018.9615   1234.6407   1483.9192   1485.3054 
        409         410         411         412         413         414 
  1015.7520   1323.5692   1359.1332   1367.0528   1378.5704   1411.1693 
        415         416         417         418         419         420 
  1679.5979   1394.0416   1364.3839   1776.1589   1477.3057   1783.6675 
        421         422         423         424         425         426 
  1756.2369   2135.8594   2109.3404   1768.2490   1465.9316   1478.6720 
        427         428         429         430         431         432 
  1037.3180   1407.5245   1026.1909   1192.0383   1422.8040   1209.8820 
        433         434         435         436         437         438 
  1022.5084   1328.2744   1247.6544   1571.5007   1707.5776   1627.4890 
        439         440         441         442         443         444 
  2020.7374   1394.0756   1253.4514   1498.8976   1593.0829   1466.6855 
        445         446         447         448         449         450 
  1178.4546   1421.2113   1327.2941   1424.1927   1489.2794   1628.3921 
        451         452         453         454         455         456 
  1620.2386   1383.3159   1372.5495   1363.2229   1572.7319   1372.0105 
        457         458         459         460         461         462 
  1238.7970   1898.5875   1793.1773   1340.0803   1062.5217   1199.3780 
        463         464         465         466         467         468 
  1714.5444   1283.2466   1457.0787   1749.0238   1375.3591   1555.6802 
        469         470         471         472         473         474 
  1329.0034   1035.5180   1103.9225   1281.7629   1255.5688   1593.3057 
        475         476         477         478         479         480 
  1279.2281   1368.7736   1366.8248   1862.6642   1351.7822   1303.4438 
        481         482         483         484         485         486 
  1337.6146   1556.5916   1856.2461   1332.5557   1666.6418   1282.3706 
        487         488         489         490         491         492 
  2055.6399   1508.6213    985.7487   1532.8121   1218.1094   1124.3992 
        493         494         495         496         497         498 
  1308.4000   1222.1741   1583.5230   1168.4008   1537.7668   1824.7368 
        499         500         501         502         503         504 
  1266.2248   1335.9646   1183.2345   1230.5743   1228.0451   1384.7836 
        505         506         507         508         509         510 
  1490.3057   1437.3756   1258.8662   1369.3883   1435.9294   1338.5226 
        511         512         513         514         515         516 
  1262.3857   1202.4203   1427.1163   1470.6493   1268.1608   1030.5578 
        517         518         519         520         521         522 
  1344.2850   1317.6361   1387.7570   1398.2015   1042.6687   1249.2707 
        523         524         525         526         527         528 
  1578.7252   1701.4041   1743.7700   1729.0309   1744.8602   1135.2259 
        529         530         531         532         533         534 
  1222.4602   1475.7510   1167.3115   1823.1396   1122.7135   1271.6489 
        535         536         537         538         539         540 
  1142.9393   1176.0804   1498.2922   1505.1633   1224.0730   1087.4987 
        541         542         543         544         545         546 
  1379.6650   1186.4852   1803.8153   1732.8998   1229.0848   1731.8414 
        547         548         549         550         551         552 
  1458.2430   1589.8250   1811.0559   1403.5066   1702.0547   1429.4905 
        553         554         555         556         557         558 
  1421.8335   1472.4847   1915.6039   1343.9482   1252.5336   1537.4070 
        559         560         561         562         563         564 
  1390.9889   1878.5414   1862.1947   1504.4893   1568.5332   1420.8944 
        565         566         567         568         569         570 
  1558.1114   1397.3755   1540.9217   1811.3260   1501.2088   1399.6496 
        571         572         573         574         575         576 
  1368.1495   1434.8707   1487.7277   1165.6239   1261.7074   1217.0343 
        577         578         579         580         581         582 
  1840.9782   1424.0714   1252.3950   1432.9991   1145.3051   1109.4439 
        583         584         585         586         587         588 
  1220.5532   1220.5532   1471.7707   1213.9950   2104.4523   1790.7922 
        589         590         591         592         593         594 
  1336.7768   1495.2453   1258.8662   1336.4129   1378.1308   1382.5840 
        595         596         597         598         599         600 
  1420.1221   1526.7390   1524.6475   1307.5424   1790.0877   1449.9160 
        601         602         603         604         605         606 
  1484.0082   1457.8364   1449.9974   1314.3703   1736.4152   1302.8501 
        607         608         609         610         611         612 
  1683.5857   1149.7145   1446.9086   1030.7723   1342.0979   2269.2781 
        613         614         615         616         617         618 
  1149.1317   1744.2257   1860.1668   1560.6119   1734.3354   1500.0480 
        619         620         621         622         623         624 
  1352.4121   1354.5100   1214.7871   1190.8506   1641.2684   1676.1796 
        625         626         627         628         629         630 
  1532.7578   3254.7600   1315.5215   1378.0472   1094.2382   1234.9406 
        631         632         633         634         635         636 
  1504.8437   1451.5345    998.7203   1734.9027   1311.2715   1216.6685 
        637         638         639         640         641         642 
  1540.2457   1169.5597   1529.0186   1909.7994   1486.1240   1564.6045 
        643         644         645         646         647         648 
  1328.5134   1420.5038   1678.9861   1632.6409   1343.7737   1245.0989 
        649         650         651         652         653         654 
  1133.3196   1801.3412   1503.8044   1162.5009   1388.6785   2260.5833 
        655         656         657         658         659         660 
  1149.8723   1537.7668   1631.9990   1139.2504   1498.7018   1330.4515 
        661         662         663         664         665         666 
  1475.6780   1474.9371   1793.7879   1456.6761   1466.7351   1465.3798 
        667         668         669         670         671         672 
  1388.9850   1164.7698   1783.2371   1404.3794   1472.3777   1456.9325 
        673         674         675         676         677         678 
  1125.9331   1434.5526  34629.0195   1252.4409   1120.1370   1519.3863 
        679         680         681         682         683         684 
  1238.4959   1453.7875   1464.0122   1420.1092   1474.6725   1681.5907 
        685         686         687         688         689         690 
  1429.3273   1502.2383   1522.8634   2192.2636   1886.1516   1526.6534 
        691         692         693         694         695         696 
  1283.3027   1448.5217   1883.5820   1298.8296   1220.8372   2224.6804 
        697         698         699         700         701         702 
  1117.8256   1109.0194   2614.9508   1694.0468   1125.7860   1344.2552 
        703         704         705         706         707         708 
  1324.2907   1284.2402   1286.9556   1367.3318   1468.1430   1454.8057 
        709         710         711         712         713         714 
  1373.7551   1696.2923   1415.2187   1684.3354   1207.2888   1203.1235 
        715         716         717         718         719         720 
  1763.9433   2092.6129   1470.6594   1217.3426   1518.8242   1508.8949 
        721         722         723         724         725         726 
  1789.5650   1671.0228   1143.9673   1813.3726   1442.7323   1435.8955 
        727         728         729         730         731         732 
  1556.6687   1767.4189   1878.4534   1553.8689   1377.6695   1649.5107 
        733         734         735         736         737         738 
  1423.7902   2071.8204   1664.5966   1144.6381   1502.0881   1190.4800 
        739         740         741         742         743         744 
  1279.0915   1857.2786   1341.1255   1819.3293   1544.5697   1529.6173 
        745         746         747         748         749         750 
  1904.9551   1571.5122   1517.0575   1557.7355   1574.7291   1586.5808 
        751         752         753         754         755         756 
  1887.3384   1313.6681   1397.8450   1692.6778   1517.3164   1489.9113 
        757         758         759         760         761         762 
  1491.7489   1485.4155   1447.0847   1457.0775   1548.4892   1918.4678 
        763         764         765         766         767         768 
  1329.4452   1460.3171   1426.2962   1438.7051   1211.8386   1444.5850 
        769         770         771         772         773         774 
  1733.6916   1473.0398   1174.9674   1713.1405   2140.0153   1776.1589 
        775         776         777         778         779         780 
  1477.6673   1470.3317   1453.3106   1587.6187   1469.2610   1262.5399 
        781         782         783         784         785         786 
  1021.1803    957.4248   1443.5933   1310.3044   1375.6129   1495.9993 
        787         788         789         790         791         792 
  1289.8287   1251.5875   1798.1505   1529.9918   1770.1796   1169.7502 
        793         794         795         796         797         798 
  1166.2898   1203.2367   1206.2678   1423.4810   1458.7037   1466.6988 
        799         800         801         802         803         804 
 15462.0911   1790.2658   1846.3755   1520.8991   1467.0040   1500.3139 
        805         806         807         808         809         810 
  1527.5916   1868.0849   1564.9760   1389.3683   1357.7972   1398.5268 
        811         812         813         814         815         816 
  2676.9227   1429.6081   1442.4184   1451.1316   1567.3157   1561.2979 
        817         818         819         820         821         822 
  1521.9362   1518.6428   1528.1593   1328.3069   1510.2139   1501.7477 
        823         824         825         826         827         828 
  1544.0825   1592.8748   1545.3200   1907.6608   1913.2441   1530.0000 
        829         830         831         832         833         834 
  1586.8478   1919.7760   1913.2441   1493.1401   1256.0834   1421.9454 
        835         836         837         838         839         840 
  1260.4852   1330.9589   1373.9969   1174.0677   1564.1957   1513.1024 
        841         842         843         844         845         846 
  1839.5795   2233.6542   1443.3035   1437.6601   1513.1821   1262.7059 
        847         848         849         850         851         852 
  1513.1821   1503.0852   1759.2160   1507.2875   1462.5745   1798.4539 
        853         854         855         856         857         858 
  1489.9221   1785.1457   1430.4833   1484.3954   1349.1845   1385.1461 
        859         860         861         862         863         864 
  1347.2937   1349.0435   1377.7732   1384.7997   1343.7104   1345.3703 
        865         866         867         868         869         870 
  1167.4483   1427.2751   1538.9708   1509.3357   1473.1471   1137.6984 
        871         872         873         874         875         876 
  1010.8701   1492.0715   1167.6663   1237.4724   1256.3496   1555.6079 
        877         878         879         880         881         882 
  1529.0186   1840.0503   1293.9598   1577.6494   1448.0637   1836.3725 
        883         884         885         886         887         888 
  1616.1326   1537.3905   1281.3706   1191.6386   1753.8059   1167.7967 
        889         890         891         892         893         894 
  1386.2431   1297.4318   1163.9108   1297.8718   1424.5524   1055.5405 
        895         896         897         898         899         900 
  1405.7939   1442.5021   1441.6266   1415.4754   1415.5886   1396.0719 
        901         902         903         904         905         906 
  1413.2169   1428.9712   1251.0365   1541.9653   1564.2071   1369.3847 
        907         908         909         910         911         912 
  1468.9030   1289.0663    946.0260   1129.5133   1334.1445   1132.5889 
        913         914         915         916         917         918 
  1369.0012   1625.4935   1615.7600   1649.8617   1619.3580   1373.9341 
        919         920         921         922         923         924 
  1362.6473   1424.9038   1691.8462   1941.2849   1596.3440   1918.4678 
        925         926         927         928         929         930 
  1584.0200   1251.6613   1459.6837   1122.6928   1309.8887   1583.4078 
        931         932         933         934         935         936 
  1396.7544   1644.5433   1661.6549   1239.5162   1358.3561   1433.6492 
        937         938         939         940         941         942 
  1687.6558   1354.2527   1715.8213   1423.2139   1521.0148   1583.3397 
        943         944         945         946         947         948 
  1529.4205   1241.9069   1249.0192   1577.9526   1575.8912   1290.0692 
        949         950         951         952         953         954 
  1072.4912   5057.6743   1340.9503   1425.2811   1780.4472   1405.8118 
        955         956         957         958         959         960 
  1468.9686   1500.5912   1717.3037   1702.2953   1441.4460   1730.9194 
        961         962         963         964         965         966 
  1577.0513   1251.7279   1480.4741   1450.2364   1405.3370   1270.8844 
        967         968         969         970         971         972 
  1258.1681   1531.8624   1270.2276   1414.2721   1219.7567   1645.3270 
        973         974         975         976         977         978 
  1758.5942   1274.1760   1270.5498   1102.6408   1635.3829   1862.3257 
        979         980         981         982         983         984 
  1239.6478   1636.1350   1500.0937   1470.1238   1337.4641   1813.8820 
        985         986         987         988         989         990 
  1763.2521   1884.8664   1569.4539   1558.3994   1533.4990   1534.8088 
        991         992         993         994         995         996 
  1491.1919   1855.1901   1820.8903   1530.7409   1505.9028   1848.4255 
        997         998         999        1000        1001        1002 
  2251.9080   2225.9638   1525.2397   1824.2902   1507.2794   1516.0955 
       1003        1004        1005        1006        1007        1008 
  2566.0558   1519.1849   1350.3442   1265.6899   1255.1795   1208.2704 
       1009        1010        1011        1012        1013        1014 
  1297.2230   1225.6862   1060.0070   2021.2899   1443.9955   1561.1137 
       1015        1016        1017        1018        1019        1020 
  1533.6905   1579.5006   1841.1012   1818.9728   1842.2264   2174.5585 
       1021        1022        1023        1024        1025        1026 
  1849.1188   1461.1989   1870.3150   1908.0347   1395.1114   1595.9381 
       1027        1028        1029        1030        1031        1032 
  1053.8400   1314.0438   1158.7674   1518.1719   1566.3555   1541.9653 
       1033        1034        1035        1036        1037        1038 
  1493.0692   1478.1386   1850.1387   1197.7896   1532.9099   1148.0207 
       1039        1040        1041        1042        1043        1044 
  1356.3169   1455.9426   1785.8970   1507.6397   1520.0362   2152.6489 
       1045        1046        1047        1048        1049        1050 
  2093.5908   1798.1049   1437.6637   1357.0407   1741.8291   1454.9505 
       1051        1052        1053        1054        1055        1056 
  1762.1467   1735.1445   1435.4244   1455.8503   1905.4177    960.4524 
       1057        1058        1059        1060        1061        1062 
  1761.8083   1580.8612   1292.8065   1451.8020   1569.7554   1570.6320 
       1063        1064        1065        1066        1067        1068 
  1509.0649   1520.3446   1914.5487   1585.4850   1548.0204   1519.5071 
       1069        1070        1071        1072        1073        1074 
  1748.4319   1751.6804   1030.2141   1518.2378   1464.6148   1454.2200 
       1075        1076        1077        1078        1079        1080 
  1472.4524   1450.1523   1501.9439   1486.8597   1519.1989   1482.9001 
       1081        1082        1083        1084        1085        1086 
  1488.2320   1413.1674   1439.2495   1118.8827   1328.1401   1180.0650 
       1087        1088        1089        1090        1091        1092 
  1389.7193   1642.9412   1351.7393   1676.8992   1347.6506   1657.6727 
       1093        1094        1095        1096        1097        1098 
  1681.9803   1337.3339   1347.3990   1649.0644   1689.2200   1394.7481 
       1099        1100        1101        1102        1103        1104 
  1764.5344   1478.0017   1474.2260   1377.6551   1470.4758   1371.1506 
       1105        1106        1107        1108        1109        1110 
  1405.9137   1532.5347   1888.2625   1463.2029   1559.1739   1859.8000 
       1111        1112        1113        1114        1115        1116 
  1866.6026   1267.1706   1527.6943   1559.1906   1810.1015   1529.4616 
       1117        1118        1119        1120        1121        1122 
  1816.5150   1603.4744   1242.4580   1194.6010   1204.1177   1402.1003 
       1123        1124        1125        1126        1127        1128 
  1429.8508   1439.9392   1478.0928   1341.1914   1728.7392   1411.4494 
       1129        1130        1131        1132        1133        1134 
  1707.9862   1421.0207   1705.7550   1405.8592   1416.0961   1416.0961 
       1135        1136        1137        1138        1139        1140 
  1421.6340   1962.2685   1715.2386   1299.0474   1376.7029   1772.9064 
       1141        1142        1143        1144        1145        1146 
  1389.2233   1369.6180   1394.4794   1702.6771   1354.4483   1853.1290 
       1147        1148        1149        1150        1151        1152 
  1736.3589   2226.5283   2304.0215   1379.0032   1660.4425   1383.6181 
       1153        1154        1155        1156        1157        1158 
  1605.2800   1377.8963   1524.1612   2209.1636   1378.6975   1238.1567 
       1159        1160        1161        1162        1163        1164 
  1432.5975   1733.1295   1766.2051   1465.6456   1413.7236   1519.5597 
       1165        1166        1167        1168        1169        1170 
  1499.2426   1396.2566   1709.7443   1692.6746   1487.7213   1830.3988 
       1171        1172        1173        1174        1175        1176 
  1488.4242   1177.8093   1530.8215   1127.6248   1274.5039   1900.2471 
       1177        1178        1179        1180        1181        1182 
  1823.9398   1507.2794   1575.9852   1576.5744   1348.4058   1322.2061 
       1183        1184        1185        1186        1187        1188 
  1523.9076   1313.5673   1447.0019   1548.2031   1678.4657   1530.8215 
       1189        1190        1191        1192        1193        1194 
  1217.1172   1256.3136   1625.2970   1716.9158   1717.1740   1296.4910 
       1195        1196        1197        1198        1199        1200 
  1576.5744   1510.8333   1552.2386   1498.0800   1883.5820   1546.2816 
       1201        1202        1203        1204        1205        1206 
  1889.5500   1392.1313   1410.5241   1361.3665   1056.2681   1228.6544 
       1207        1208        1209        1210        1211        1212 
   966.9584   1383.9707   1439.6025   1429.0514   1278.3109   1666.3168 
       1213        1214        1215        1216        1217        1218 
  1380.3089   1372.8721   1657.6727   1403.5791   1366.5896   1407.9446 
       1219        1220        1221        1222        1223        1224 
  1406.5414   1383.0812   1401.7660   1524.7446   1520.7946   1069.7347 
       1225        1226        1227        1228        1229        1230 
  1332.6973   1337.6567   1353.7773   1368.7563   1808.3107   1244.3157 
       1231        1232        1233        1234        1235        1236 
  1552.0384   1845.1011   1526.2768   1513.9230   1686.3678   1681.9803 
       1237        1238        1239        1240        1241        1242 
  1403.6735   1679.7986   1559.3715   1370.9897   1344.7729   1387.5797 
       1243        1244        1245        1246        1247        1248 
  1686.9187   1406.5468   1683.8815   1555.2272   1808.9827   1511.3070 
       1249        1250        1251        1252        1253        1254 
  1862.7930   1507.6675   1507.1862   1567.3127   1862.3200   1558.1114 
       1255        1256        1257        1258        1259        1260 
  1552.8011   1538.9708   1545.3758   1507.9131   1538.7201   1183.0257 
       1261        1262        1263        1264        1265        1266 
  1846.3557   1816.5150   1256.9849   1818.9735   1566.2820   1562.7881 
       1267        1268        1269        1270        1271        1272 
  1298.7784   1022.3997   1914.5980   1337.7513   1329.9891   1486.3907 
       1273        1274        1275        1276        1277        1278 
  1536.6156   1848.3950   1541.2859   1537.8584   1535.4683   1535.2960 
       1279        1280        1281        1282        1283        1284 
  1139.3034   1154.9651   1510.6359   1195.4721   1200.8573   1222.5116 
       1285        1286        1287        1288        1289        1290 
  1277.1802   1496.4812   1524.1009   1572.8514   1529.7784   1557.2184 
       1291        1292        1293        1294        1295        1296 
  1503.1776   1547.2177   1542.6995   1197.8327   1208.7965   1291.9160 
       1297        1298        1299        1300        1301        1302 
  1279.6317   1177.1818   1466.7826   1186.4464   1192.8792   1432.0045 
       1303        1304        1305        1306        1307        1308 
  1571.5122   2252.0585   1777.4566   1531.2015   1896.8295   1921.0850 
       1309        1310        1311        1312        1313        1314 
  1020.3353   1131.2261   1573.2479   1592.6463   1528.8159   1874.2277 
       1315        1316        1317        1318        1319        1320 
  5997.9837   1510.3730   1543.1229   1509.6065   1734.0134   1366.0061 
       1321        1322        1323        1324        1325        1326 
  1198.8558   1510.8333   1512.8943   1810.7582   1481.9864   1508.7750 
       1327        1328        1329        1330        1331        1332 
  1498.4000   1597.5784   1617.7021   1585.5410   1685.0302   1346.1890 
       1333        1334        1335        1336        1337        1338 
  1231.6133   1482.1580   1499.0715   1598.8427   1432.0612   1355.1365 
       1339        1340        1341        1342        1343        1344 
  1107.1538   1650.0396   1335.0542   1393.3381   1286.0786   1508.9906 
       1345        1346        1347        1348        1349        1350 
  1234.6407   1823.6006   1582.3229   1833.2677   1821.7188   1516.8594 
       1351        1352        1353        1354        1355        1356 
  1509.1738   1522.4327    961.5202   1470.9726   1398.6106   1581.1826 
       1357        1358        1359        1360        1361        1362 
   990.0803   1390.3268   1321.4490   1832.2036   2114.9799   1832.2001 
       1363        1364        1365        1366        1367        1368 
  1724.8486   1359.1980   1326.3021   1278.5284   1501.0156   1777.8242 
       1369        1370        1371        1372        1373        1374 
  1506.5059   1468.8743   1516.2825   1385.2368   1405.2843   1342.2715 
       1375        1376        1377        1378        1379        1380 
  1439.8236   1069.2379   1438.3718   1336.4146   1508.3043   1394.0240 
       1381        1382        1383        1384        1385        1386 
  1497.4147   1515.3203   1530.8215   1501.5030   1511.4795   5486.9771 
       1387        1388        1389        1390        1391        1392 
  1463.2067   1497.7922   1539.1110   1515.2493   1558.7923   3155.1099 
       1393        1394        1395        1396        1397        1398 
  1297.3263   1037.7834   1585.3841   2116.3529   1545.7969   1670.1993 
       1399        1400        1401        1402        1403        1404 
  1839.7237   1518.7227   1804.4794   1507.3779   1496.3863   1440.9083 
       1405        1406        1407        1408        1409        1410 
  1303.8025   1134.4524   1053.1778   1671.9034   1444.6412   1455.6942 
       1411        1412        1413        1414        1415        1416 
  1389.8175   1027.1090   1558.4048   1558.7102   1236.2597   1326.0015 
       1417        1418        1419        1420        1421        1422 
  1057.0398   1683.2418   1745.5180   1261.4734   1258.4312   1732.9511 
       1423        1424        1425        1426        1427        1428 
  1515.2741   1375.2731   2136.3092   1471.4491   1779.0589   1507.2516 
       1429        1430        1431        1432        1433        1434 
  1800.5378   2093.1846   1497.1293   1516.2767   1235.2607   1231.0579 
       1435        1436        1437        1438        1439        1440 
  1381.1005   1359.5111   1842.3467   1491.1162   1518.3510   1698.5159 
       1441        1442        1443        1444        1445        1446 
  1874.2474   1586.5115   1558.7749   1178.9345   1187.6168   1628.9976 
       1447        1448        1449        1450        1451        1452 
  1421.8843   1553.9418   1478.3181   1776.6257   1443.6315   1056.5168 
       1453        1454        1455        1456        1457        1458 
  1844.2952   1533.5797   1466.4562   1158.0888   1323.2123   1569.4539 
       1459        1460        1461        1462        1463        1464 
  1548.5821   1565.7507   1546.4724   1800.9800   1866.1494   1545.4186 
       1465        1466        1467        1468        1469        1470 
  1900.9218   1186.2203   1549.9357   1594.4522   1895.0031   1843.6030 
       1471        1472        1473        1474        1475        1476 
  1395.5906   1506.7247   1170.5849   1530.4469   1517.9514   1527.2923 
       1477        1478        1479        1480        1481        1482 
  1822.7620   1527.5916   1530.4045   1673.5126   1392.4146   1532.9929 
       1483        1484        1485        1486        1487        1488 
  1537.7668   1512.0272   1506.0556   1498.7018   1832.0579   1817.8192 
       1489        1490        1491        1492        1493        1494 
  1519.7413   2013.9519   1663.0480   1382.5167   1390.0112   1665.0026 
       1495        1496        1497        1498        1499        1500 
  1663.6846   1388.4277   1383.5525   1382.4009   1681.5168   1820.7811 
       1501        1502        1503        1504        1505        1506 
  1269.4332   1523.8707   1529.8695   1821.9001   1817.7200   1529.9553 
       1507        1508        1509        1510        1511        1512 
  1515.6254   1537.9476   1516.5709   1531.1157   1574.4266   1495.2012 
       1513        1514        1515        1516        1517        1518 
  1551.7520   1490.6706   1428.4286   1433.5883   1441.5003   1518.1638 
       1519        1520        1521        1522        1523        1524 
  1828.3393   1589.9177   1687.0709   1654.8381   1543.8755   1562.0707 
       1525        1526        1527        1528        1529        1530 
  1880.3470   1481.6256   1554.3581   1416.3922   1580.6809   1293.3568 
       1531        1532        1533        1534        1535        1536 
  1538.0481   1527.7036   1896.0010   1890.8385   1566.9322   1513.4489 
       1537        1538        1539        1540        1541        1542 
  1308.0549   1586.4117   1588.9313   1779.7412   1863.1373   1964.9544 
       1543        1544        1545        1546        1547        1548 
  1301.9230   1898.5875   1562.9502   1227.0449   1536.9799   1487.5585 
       1549        1550        1551        1552        1553        1554 
  1422.2268   1319.6748   1713.3669   1572.9095   1407.8396   1145.3447 
       1555        1556        1557        1558        1559        1560 
  1860.2553   1846.2386   1530.0723   1554.9285   1399.4009   2117.8847 
       1561        1562        1563        1564        1565        1566 
  1228.4288   1197.7789   1381.5399   1235.5802   1346.0526   1283.6817 
       1567        1568        1569        1570        1571        1572 
  1505.5013   1248.1382   1279.1083   1298.4876   1445.8159   2143.5169 
       1573        1574        1575        1576        1577        1578 
  2252.9559   1546.2816 110093.2790   1656.9366   1484.0652   1908.6051 
       1579        1580        1581        1582        1583        1584 
  1415.1967   1719.1166   1417.4218   1203.9762   1500.4989   1490.3893 
       1585        1586        1587        1588        1589        1590 
  1815.9553   1514.9716   1503.4722   2194.7285   1751.2612   1303.0814 
       1591        1592        1593        1594        1595        1596 
  1301.0022   1613.4846   1366.3902   1421.3464   1523.8274   1490.6706 
       1597        1598        1599        1600        1601        1602 
  1404.3565   1860.2553   1821.9204   1486.8894   1815.1621   1462.2166 
       1603        1604        1605        1606        1607        1608 
  1845.1011   1291.5998   3126.3055   1258.4312   1132.6213   1549.8116 
       1609        1610        1611        1612        1613        1614 
  1359.4357   1867.0659   1537.3905   1765.0490   1523.1618   1819.4188 
       1615        1616        1617        1618        1619        1620 
  1533.9522   1553.8689   1556.5556   1544.7436   1530.0582   1531.9595 
       1621        1622        1623        1624        1625        1626 
  1813.3301   1528.7360   1556.8977   1873.6503   1514.2080   2179.8004 
       1627        1628        1629        1630        1631        1632 
  1532.9092   1559.8377   1226.3874   1863.1509   1552.4128   1848.3950 
       1633        1634        1635        1636        1637        1638 
  1557.2298   1517.2085   1890.5878   2234.0765   1293.2024   1562.9850 
       1639        1640        1641        1642        1643        1644 
  1408.2892   1205.6761   1511.8635   1506.7195   1171.1649   1490.5572 
       1645        1646        1647        1648        1649        1650 
  1802.5800   1338.6993   1353.9264   1532.9099   1333.0904   1280.5486 
       1651        1652        1653        1654        1655        1656 
  1368.5251   1619.5845   1315.1400   1371.6530   1649.8873   1647.9441 
       1657        1658        1659        1660        1661        1662 
  1356.7774   1359.7072   1196.9034   1627.2898   2382.6137   2848.1191 
       1663        1664        1665        1666        1667        1668 
  1494.5292   1449.0093   2656.2372   1487.6929   1478.2079   1384.8923 
       1669        1670        1671        1672        1673        1674 
  1809.1530   1462.3520   1581.7349   1532.9099   1541.9653   1239.3111 
       1675        1676        1677        1678        1679        1680 
  1553.3842   1364.9481   1753.3484   1395.7343   1311.0796   1478.9590 
       1681        1682        1683        1684        1685        1686 
  1519.0034   1566.2478   1543.6910   1536.2425   1526.6534   1837.2174 
       1687        1688        1689        1690        1691        1692 
  1523.9076   1864.0632   1537.5659   1539.4878   1857.7210   1544.3655 
       1693        1694        1695        1696        1697        1698 
  1534.6254   1303.6946   1436.2218   1901.3180   1571.5908   1858.9029 
       1699        1700        1701        1702        1703        1704 
  2286.5402   1576.6618   1694.2283   1287.7678   1247.5499   1355.5489 
       1705        1706        1707        1708        1709        1710 
  1165.9600   1436.6712   1144.9192   1801.3864   1293.7504   1396.2817 
       1711        1712        1713        1714        1715        1716 
  1441.3934   1733.9435   1773.1926   1422.6783   1736.8601   1742.7958 
       1717        1718        1719        1720        1721        1722 
  1739.5443   1403.9241   1437.2942   1411.6826   1409.7773   1289.2466 
       1723        1724        1725        1726        1727        1728 
  1681.5907   1526.6534   1511.4934   1839.2565   1828.8065   1513.1740 
       1729        1730        1731        1732        1733        1734 
  1828.0112   1854.3798   1543.6910   1853.9260   1226.4956   1581.1783 
       1735        1736        1737        1738        1739        1740 
  1345.3774   1093.7892   1552.6098   1400.2154   1247.9753   1771.0974 
       1741        1742        1743        1744        1745        1746 
  1484.4490   1466.7165   1481.0568   1475.0148   1462.6330   1441.0063 
       1747        1748        1749        1750        1751        1752 
  1421.0568   1781.2471   1458.4543   1444.3412   1470.9714   1399.0001 
       1753        1754        1755        1756        1757        1758 
  2039.3912   1484.1775   1326.4015   1691.9320   1363.5852   1391.5344 
       1759        1760        1761        1762        1763        1764 
  1388.7494   1659.1071   1407.5032   1373.9422   1375.1232   1365.8069 
       1765        1766        1767        1768        1769        1770 
  1376.7801   1532.9099   1526.6534   1534.5110   1558.1114   2250.0795 
       1771        1772        1773        1774        1775        1776 
  1551.7520   1462.3756   1501.8596   1504.0096   1514.9553   1519.7413 
       1777        1778        1779        1780        1781        1782 
  1506.1650   1514.9582   1520.3120   1826.7789   1850.1387   1746.1828 
       1783        1784        1785        1786        1787        1788 
  1695.8990   1403.1439   1432.6038   1773.4331   1754.5311   1770.2598 
       1789        1790        1791        1792        1793        1794 
  1764.2401   1411.6159   1507.8315   1177.2351   1417.7659   1404.3006 
       1795        1796        1797        1798        1799        1800 
  1395.7930   1508.8457   1531.8654   1521.9240   1853.5627   1548.9612 
       1801        1802        1803        1804        1805        1806 
  1843.4861   1646.1999   1648.1226   1633.2011   1358.1006   1371.2377 
       1807        1808        1809        1810        1811        1812 
  1538.8154   9583.6834   1549.2475   2237.9728   1494.4505   1432.1415 
       1813        1814        1815        1816        1817        1818 
  1479.1545   1487.5202   1442.3003   1797.1373   1402.0087   1476.8527 
       1819        1820        1821        1822        1823        1824 
  1865.3307   1532.3398   2231.6116   1468.7360   2684.0654   1545.9002 
       1825        1826        1827        1828        1829        1830 
  1544.7459   1511.5915   1561.7761   1847.0418   1804.3844   1532.2344 
       1831        1832        1833        1834        1835        1836 
  1513.8462   1523.1618   1836.7678   1502.4397   1519.1989   1530.4469 
       1837        1838        1839        1840        1841        1842 
  1509.3438   1503.2597   1523.2361   1181.7111   1482.1556   1796.8866 
       1843        1844        1845        1846        1847        1848 
  1158.5555   1507.7469   1822.2509   1755.4827   1828.9197   1836.6174 
       1849        1850        1851        1852        1853        1854 
  1840.5276   1523.1618   1492.5906   1505.2260   1516.5738   1513.1850 
       1855        1856        1857        1858        1859        1860 
  1217.0455   1528.0736   1813.5768   1537.0142   1521.2463   1525.1401 
       1861        1862        1863        1864        1865        1866 
  1543.3132   1508.5750   1541.0093   1378.5810   1818.1790   1667.3165 
       1867        1868        1869        1870        1871        1872 
  1515.2464   1512.1429   1498.3408   1474.4050   1574.4266   1564.9760 
       1873        1874        1875        1876        1877        1878 
  1558.3994   1531.9595   1552.4742   1580.4920   1518.2519   1876.3535 
       1879        1880        1881        1882        1883        1884 
  1530.4469   1766.2112   1784.9125   1791.7922   1469.7482   1801.9096 
       1885        1886        1887        1888        1889        1890 
  1472.8394   1448.1427   1490.8407   1541.0206   1492.3956   1494.6429 
       1891        1892        1893        1894        1895        1896 
  1816.4955   1502.6874   1817.2759   1431.2201   1512.5241   1510.4635 
       1897        1898        1899        1900        1901        1902 
  1505.0860   1805.4538   1495.0197   1567.3157   1566.2478   1565.8644 
       1903        1904        1905        1906        1907        1908 
  1578.7252   1572.3553   1574.4266   1139.0302   1509.3298   1833.2346 
       1909        1910        1911        1912        1913        1914 
  1446.2990   1555.6079   1571.2984   1585.8878   1538.8154   1810.7380 
       1915        1916        1917        1918        1919        1920 
  1498.7046   1490.8975   1493.0615   1792.2178   1479.5029   1396.6235 
       1921        1922        1923        1924        1925        1926 
  1434.4463   1557.3261   1551.0743   1554.7253   1585.8878   1826.3319 
       1927        1928        1929        1930        1931        1932 
  2201.6751   1515.8095   1820.2201   1493.1424   1499.8326   1493.7002 
       1933        1934        1935        1936        1937        1938 
  1571.2103   1343.3018   1765.5456   1901.7562   1257.5140   1836.2046 
       1939        1940        1941        1942        1943        1944 
  1880.3237   1806.4266   1572.8400   1552.6146   1169.9687   1555.3145 
       1945        1946        1947        1948        1949        1950 
  1522.7891   1543.2124   1526.8440   1847.4975   1823.7211   2139.2663 
       1951        1952        1953        1954        1955        1956 
  1820.6790   1838.5966   1518.7227   1206.4296   1777.3986   1696.2509 
       1957        1958        1959        1960        1961        1962 
  1416.7637   3016.1613   1022.8980   1718.2557   1413.9488   1469.7189 
       1963        1964        1965        1966        1967        1968 
  1405.3761   1261.2495   1433.7391   1527.4948   1858.9877   1758.6362 
       1969        1970        1971        1972        1973        1974 
  1424.3274   1852.1958   1450.3551   1332.1314   1546.8337   1856.2090 
       1975        1976        1977        1978        1979        1980 
  1855.8453   1863.9206   1855.7412   1538.9821   1550.6947   1866.6026 
       1981        1982        1983        1984        1985        1986 
  1847.2526   1375.4744   1353.6393   1540.1321   1759.1877   1529.2930 
       1987        1988        1989        1990        1991        1992 
  1527.2923   1535.7580   1256.1353   1514.1609   1464.2777   1726.0541 
       1993        1994        1995        1996        1997        1998 
  1800.8353   1845.5726   1821.4612   1813.9315   1519.9702   1528.6555 
       1999        2000        2001        2002        2003        2004 
  1841.7826   1521.0645   1521.4591   1854.3798   1854.3798   2206.5660 
       2005        2006        2007        2008        2009        2010 
  1851.4529   1255.1975   1216.7865   1489.6279   2091.1179   2604.0945 
       2011        2012        2013        2014        2015        2016 
  1804.9405   1788.8810   1596.3440   1580.1052   1531.9595   1463.9492 
       2017        2018        2019        2020        2021        2022 
  1597.4325   1585.4997   1450.4421   1540.9146   1577.2633   1525.1594 
       2023        2024        2025        2026        2027        2028 
  1514.6037   1560.3329   1573.3538   1543.6097   1549.0696   1428.3246 
       2029        2030        2031        2032        2033        2034 
  1525.3938   1840.3869   1533.6628   1544.8302   1532.1597   1850.7786 
       2035        2036        2037        2038        2039        2040 
  1134.6643   1845.3081   1845.3081   1473.3398   1502.2492   1699.4818 
       2041        2042        2043        2044        2045        2046 
  1979.0128   1426.4493   1481.3550   1489.0746   1503.4561   1507.3779 
       2047        2048        2049        2050        2051        2052 
  1502.2492   1811.3260   1837.2174   1823.4934   1750.1761   1474.0205 
       2053        2054        2055        2056        2057        2058 
  1830.5050   1816.0509   1575.5001   2484.5946   2440.1840   1280.9614 
       2059        2060        2061        2062        2063        2064 
  1198.3124   1116.4817   1512.1539   1429.8619   1549.6380   1814.5498 
       2065        2066        2067        2068        2069        2070 
  2621.7636   1454.1660   1840.9782   1836.6174   1827.5505   1836.0685 
       2071        2072        2073        2074        2075        2076 
  2196.8626   1831.8390   1515.3203   1509.2342   1322.0509   1323.2736 
       2077        2078        2079        2080        2081        2082 
  1522.4965   1521.4451   1526.1800   1215.9090   1516.1555   1876.9305 
       2083        2084        2085        2086        2087        2088 
  1300.8386   1517.6878   1532.1597   2263.1132   1908.5018   1898.5875 
       2089        2090        2091        2092        2093        2094 
  1426.8117   1575.7798   1514.9582   1775.4090   1574.2519   1571.5774 
       2095        2096        2097        2098        2099        2100 
  1504.0096   1481.8036   1523.9076   1502.5078   1496.1966   1785.1169 
       2101        2102        2103        2104        2105        2106 
  1525.7428   1540.9146   1560.2370   1820.6594   1420.6395   1868.6952 
       2107        2108        2109        2110        2111        2112 
  1129.0842   1548.5821   1347.0902   1298.8298   1038.4911   1523.9211 
       2113        2114        2115        2116        2117        2118 
  2694.2597   2298.5557   1566.4908   1996.2584   1992.7905   1579.3138 
       2119        2120        2121        2122        2123        2124 
  1524.9438   1507.7631   1823.8438   1518.6428   1522.7891   1545.0404 
       2125        2126        2127        2128        2129        2130 
  1530.0723   1832.5670   1474.4934   1516.5738   1508.3263   1387.8096 
       2131        2132        2133        2134        2135        2136 
  1122.1022   1552.8972   1553.9561   1539.4878   1555.9887   1555.9887 
       2137        2138        2139        2140        2141        2142 
  1122.1022   1964.9423   1343.5277   1120.7647   1795.3789   1335.6543 
       2143        2144        2145        2146        2147        2148 
  1463.7124   1142.6745   1456.7344   1764.5147   1470.6278   1488.4127 
       2149        2150        2151        2152        2153        2154 
  1963.3468   1446.9676   1979.7049   1492.1032   1220.6709   1547.9046 
       2155        2156        2157        2158        2159        2160 
  1729.0561   1393.6765   1841.1812   1843.7963   1496.1966   1556.4132 
       2161        2162        2163        2164        2165        2166 
  1532.9099   1165.1957   1823.1396   2224.0197   2175.8758   2214.5084 
       2167        2168        2169        2170        2171        2172 
  1265.2474   1276.2333   1257.0748   1089.3350   1401.3794   1543.6761 
       2173        2174        2175        2176        2177        2178 
  1233.1546   1413.5065   1712.8282   1402.9763   1579.7864   1435.6533 
       2179        2180        2181        2182        2183        2184 
  1312.9814   1470.6335   1482.4324   1786.6513   1488.4242   1504.0096 
       2185        2186        2187        2188        2189        2190 
  1767.8197   1703.1796   1699.4782   1338.6993   1467.6292   1473.7265 
       2191        2192        2193        2194        2195        2196 
  1465.1647   1452.4046   1775.2801   1619.8094   1294.0508   1280.2863 
       2197        2198        2199        2200        2201        2202 
  1610.1013   1897.8780   1506.0585   1616.0353   1601.6738   1471.0433 
       2203        2204        2205        2206        2207        2208 
  1452.7489   1591.8548   1594.7260   1748.4319   1488.4242   1478.3130 
       2209        2210        2211        2212        2213        2214 
  1479.0312   1766.1861   1756.6797   1476.6604   1459.0104   1760.1769 
       2215        2216        2217        2218        2219        2220 
  1760.8181   1472.5299   1772.9903   1776.1718   1459.6479   1537.5659 
       2221        2222        2223        2224        2225        2226 
  1339.2036   1522.8692   1498.8924   1523.9076   1521.6017   1498.3298 
       2227        2228        2229        2230        2231        2232 
  1328.6993   1534.6254   1469.2610   1467.6157   1769.3228   1467.6980 
       2233        2234        2235        2236        2237        2238 
  1775.2801   1752.8333   1306.9048   1362.6571   1324.5192   1868.6607 
       2239        2240        2241        2242        2243        2244 
  1308.1160   1725.4669   1100.8896   1209.2040   1241.0944   1391.1552 
       2245        2246        2247        2248        2249        2250 
  1393.1467   1208.3675   1283.1999   2210.6700   1548.1805   1847.6068 
       2251        2252        2253        2254        2255        2256 
  1854.3762   1537.0030   1536.5182   1540.7968   1502.2383   1461.6953 
       2257        2258        2259        2260        2261        2262 
  1458.0722          NA   1469.0741   1484.7399   1483.7282   1487.0293 
       2263        2264        2265        2266        2267        2268 
  1492.0224   1478.3073   1484.7399   1464.7105   1470.2736   1482.6340 
       2269        2270        2271        2272        2273        2274 
  1542.7733   1008.5180   1440.7747   1459.0787   1461.0007   1462.9939 
       2275        2276        2277        2278        2279        2280 
  1459.0104   1457.6353   1758.6362   1292.7969   1439.3910   1452.2530 
       2281        2282        2283        2284        2285        2286 
  1858.9877   1548.9612   1872.8665   1874.2487   1866.2504   1555.9714 
       2287        2288        2289        2290        2291        2292 
  1337.1203   1291.1661   1317.2878   1093.1674   1149.2046   1514.1778 
       2293        2294        2295        2296        2297        2298 
  1564.9786   1199.4968   2369.7529   1520.7946   3850.7467   1509.4992 
       2299        2300        2301        2302        2303        2304 
  1180.2312   1298.5659   1311.3383   1322.9592   1589.9177   1532.2344 
       2305        2306        2307        2308        2309        2310 
  1106.2375   1268.0210   1125.3578   1289.2768   1241.9406   1460.8059 
       2311        2312        2313        2314        2315        2316 
  1412.4735   1203.9762   1323.6167   1488.5753   1686.1513   1571.0076 
       2317        2318        2319        2320        2321        2322 
  1515.2493   1526.6534   1538.5109   1847.7914   1532.1031   1513.1850 
       2323        2324        2325        2326        2327        2328 
  1516.6537   1597.3677   1839.7237   1543.0138   1543.6910   1429.6081 
       2329        2330        2331        2332        2333        2334 
  1481.4680   1265.9208   1533.5797   1830.9665   1384.2672   1370.9897 
       2335        2336        2337        2338        2339        2340 
  1821.9009   1826.4280   1828.9197   1826.4280   1828.9197   1513.9259 
       2341        2342        2343        2344        2345        2346 
  1494.1583   1611.7322   1524.5735   1838.4701   1294.9597   1376.8318 
       2347        2348        2349        2350        2351        2352 
  1523.6915   1540.9146   1506.0614   1497.2168   1518.7227   1499.0228 
       2353        2354        2355        2356        2357        2358 
  1531.4905   1526.7249   1522.8692   1512.8943   1491.9278   1196.4792 
       2359        2360 
  1521.0645   1282.6056 

5 Deliverables

  • By midnight on Monday 17 Oct 2022, you must upload to Canvas a short presentation (4-5 slides at most) with your findings, as some groups will be asked to present in class. The presentation should cover your Exploratory Data Analysis as well as your best model. In addition, you must upload to Canvas your final report, written in R Markdown, that introduces, frames, and describes your story and findings. The report should include the following sections:
  1. Executive Summary: Based on your best model, indicate the factors that influence price_4_nights. This should be written for an intelligent but non-technical audience. All other sections can include technical writing.
  2. Data Exploration and Feature Selection: Present key elements of the data, including tables and graphs that help the reader understand the important variables in the dataset. Describe how the data was cleaned and prepared, including feature selection, transformations, interactions, and other approaches you considered.
  3. Model Selection and Validation: Describe the model fitting and validation process used. State the model you selected and why it is preferable to the alternatives you considered.
  4. Findings and Recommendations: Interpret the results of the selected model and discuss additional steps that might improve the analysis.

Remember to follow R Markdown etiquette rules and style: don’t have the Rmd output extraneous messages or warnings, present summary tables as nicely formatted tables (use kableExtra), and remove any placeholder text left over from past Rmd templates. In other words, I don’t want to see stuff I wrote in your final report.
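
As a rough illustration of the etiquette points above, the sketch below shows one way to silence messages and warnings for the whole document and to print a summary table with kableExtra. It is only a minimal sketch: it assumes the knitr and kableExtra packages are installed, it uses the built-in mtcars data as a stand-in for your Airbnb listings, and the object names (car_summary, summary_table) are hypothetical.

```r
# Minimal sketch of a setup section for the report (assumes knitr and
# kableExtra are installed; mtcars stands in for the Airbnb data).
library(knitr)
library(kableExtra)

# Suppress messages and warnings in every chunk of the knitted report
opts_chunk$set(message = FALSE, warning = FALSE)

# Hypothetical summary table: mean mpg and number of cars by cylinder count
car_summary <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
car_summary$n <- as.vector(table(mtcars$cyl))

# Format the table with kableExtra instead of printing the raw data frame
summary_table <- kable(
  car_summary,
  format = "html",
  digits = 1,
  col.names = c("Cylinders", "Mean mpg", "Number of cars"),
  caption = "Illustrative summary table (built-in mtcars data)"
) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

summary_table
```

In an actual report, the opts_chunk$set() call would normally sit in the first chunk of the Rmd file so that it applies to every chunk that follows.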

6 Rubric

Your work will be assessed on a rubric which you can find here

7 Acknowledgements